Introduction.


The present work develops an analysis for a particular kind of data
called the Q-sort. Most commonly occurring in connection with personality
assessment, these data are typically generated as follows: A rater is
presented with a deck of cards, called the Q-deck or Q-set, on which are
written different descriptive statements. The rater is told to order the
cards according to some criterion. As a very common example, the cards
of the Q-deck might carry different descriptions of personality, and the
rater would be asked to order them according to their similarity to the
personality of a designated individual, the subject. Although occasionally
the rater's task is to completely order the cards (called Q-items or
simply items), this procedure becomes far too demanding as the number of
items increases. In that case, the rater is asked to classify
each item according to its degree of concordance with the subject, ties
permitted, thereby making the rater's task tractable. However, in order
to enforce the similarity of this simplified task to the more
difficult task of completely ordering the items, the so-called forced
distribution is imposed. Under this restriction, the number of items
that the rater may assign to any rank is fixed.

For example, the number of Q-items in the deck is often 100. Nine
categories of similarity might be used, ranging from 1, "most
uncharacteristic," through 5, "neither characteristic nor uncharacteristic,"
up to 9, "most characteristic." The number permitted in each of the nine
categories might then be 5, 8, 12, 16, 18, 16, 12, 8, 5, respectively. Thus,
exactly five items would be forced to be rated "most uncharacteristic,"
eight other items forced to be rated at the next most uncharacteristic
level, and so on. The ranking of the deck as a whole is called a Q-sort.


The rationale for such a forced distribution is that, as with the
complete ordering of the items, the moments of the distribution of scores
of the items of any Q-sort are fixed. Noting that, in particular, within
each subject the mean and variance of the items' scores are fixed, Q-sort
data are sometimes described as being "standardized within subjects"
as opposed to being "standardized within variables", the consequence of
imposing the more usual location and scale invariance on a set of
multivariate data.
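The fixed moments can be checked directly from the category sizes. The following is a minimal Python sketch, assuming the nine categories are scored 1 through 9 as in the example above; the names `SCORES`, `SIZES`, `forced_moments`, and `satisfies_forced_distribution` are illustrative and do not appear in the original work.

```python
from fractions import Fraction

# Category scores 1..9 and the forced category sizes quoted in the text.
SCORES = list(range(1, 10))
SIZES = [5, 8, 12, 16, 18, 16, 12, 8, 5]


def forced_moments(scores, sizes):
    """Return (number of items, mean, variance) implied by a forced distribution."""
    n = sum(sizes)
    mean = Fraction(sum(s * c for s, c in zip(scores, sizes)), n)
    var = sum(c * (s - mean) ** 2 for s, c in zip(scores, sizes)) / n
    return n, mean, var


def satisfies_forced_distribution(sort, scores, sizes):
    """True if a Q-sort (one category score per item) uses each category
    exactly the required number of times."""
    return len(sort) == sum(sizes) and all(
        sort.count(s) == c for s, c in zip(scores, sizes)
    )


n_items, mean, var = forced_moments(SCORES, SIZES)
print(n_items, mean, var)  # 100 5 108/25
```

Every sort that passes `satisfies_forced_distribution` has the same multiset of scores, so its mean (5) and variance (108/25 = 4.32) are the same regardless of which items land in which category.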

The feature of Q that the psychometric community considers
distinguishing is usually described in terms of the matrix of data, $X$,
whose rows represent the different subjects and whose columns represent the
scores of the different items. Thus, $X_{ji}$ is the score given to the
i-th item in the forced distribution describing the j-th subject. The
common and familiar practice is to standardize the matrix $X$ by arguing
that one's inferences ought not depend on the overall level of the item
(or variable) i; that is, one ought to be invariant to the item mean
$\bar{X}_{\cdot i}$. Similarly, the second central moment of the i-th item
is usually considered an invariant.

If one denotes the standardized version of $X$ by $Z$, note that the
correlation matrix of the items corresponding to $X$ is simply
$R = \frac{1}{J} Z'Z$, where $J$ is the number of subjects. One might then
decompose this matrix $R$ into factors that, appropriately rotated, would
reveal groups of similar items.

In contrast, with Q-sort data the quantities $\bar{X}_{j\cdot}$ are all
constant and equal to the mean of the forced distribution, as are the
analogous second moments. With this observation, one might as well
standardize $X$ such that $\bar{X}_{j\cdot} = 0$ and
$\frac{1}{I}\sum_{i=1}^{I} X_{ji}^2 = 1$, where $I$ is the number of items.
Calling this transformed matrix $Y$, by analogy to the correlation matrix of
the items, one considers the matrix $Q = \frac{1}{I} YY'$, the "correlation
matrix of subjects." The matrix $Q$ may then be decomposed by factor analytic
techniques to obtain factors or "clusters" of similar subjects. For this
reason, Q methodology is sometimes considered a competitor to cluster
analysis or, rather, a forerunner of historical interest. Q does not
explicitly formulate this problem as a clustering problem; as such, this
methodology is rarely used (Overall and Klett [1972]).
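As an illustration of the "correlation matrix of subjects," the sketch below standardizes each row of a small data matrix and forms the matrix of average cross-products between subjects. It is a sketch under stated assumptions: the toy data (3 subjects, 6 items, a forced 1-2-2-1 distribution on scores 1 to 4) and the function names are hypothetical, and plain Python lists stand in for matrices.

```python
import math


def standardize_rows(X):
    """Center and scale each row (subject) to mean 0 and mean square 1."""
    Y = []
    for row in X:
        n = len(row)
        mean = sum(row) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in row) / n)
        Y.append([(x - mean) / sd for x in row])
    return Y


def subject_correlations(X):
    """Return the J x J matrix (1/I) Y Y' of correlations between subjects."""
    Y = standardize_rows(X)
    n_items = len(Y[0])
    return [
        [sum(a * b for a, b in zip(yj, yk)) / n_items for yk in Y]
        for yj in Y
    ]


# Hypothetical toy data: rows are subjects, columns are items.
# Every row uses score 1 once, 2 twice, 3 twice, and 4 once.
X = [
    [1, 2, 2, 3, 3, 4],
    [4, 3, 3, 2, 2, 1],
    [1, 2, 3, 2, 3, 4],
]
Q = subject_correlations(X)
```

Because every row obeys the same forced distribution, the row means and variances coincide, so the standardization amounts to one common shift and rescaling. In this toy example the second subject's sort is the exact reverse of the first, which gives a correlation of -1 between them.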

However, the factor analysis of subject-standardized Q data was not
the only, nor even the primary, proposal made by the innovator of Q,
William Stephenson. His more fundamental contribution was the methodology
whereby the Q-deck was itself constructed. These Q-decks are called
structured, and are, historically, the first kind of Q-sets employed.

The starting point of the structured Q-set is the psychological theory
whose validity is being investigated. In the area of personality theory,
the type psychologies furnish the simplest examples. In such schemes,
Q-items are chosen to represent different types postulated by a particular
theory. By such a deliberate procedure, a design matrix Q can be designated
as corresponding to the structure of the Q-deck. Stephenson himself typically
created multiway cross-factorial designs, taking pains to "balance" the
structure by ensuring each cell in such a design had an equal number of
representative items. The reader is referred to Kerlinger (1972) for a
detailed description of Stephenson's structured Q-sort methodology.

At the same time as Stephenson was developing various aspects of his
Q-technique, an alternative paradigm for questionnaire construction was
becoming widely accepted. This paradigm was based on the dual concepts of
validity and reliability, each of which was in turn refined into secondary
levels: construct validity, concurrent validity, interrater reliability,
intrarater reliability and so on. These concepts as a whole were integrated
by Cronbach et al. (1973) into a theory of generalizability.

Stephenson's structured Q-set (Stephenson [1953]) failed to compete
successfully with the requirements of generalizability theory. The
methodological problems regarding its validity and reliability (e.g.
Sundland [1962]) were sufficient to greatly restrict its use. In fact,
Block (1961) was able to substantially alter the scope of Q-studies by
responding to these issues of validity and reliability; the result was his
unstructured California Q-set. Only after Block's work did Q become
identified exclusively as the kind of factor/cluster analysis described
above; other innovations of Stephenson, especially his structuring of
Q-sets, received less attention.


Although the present work ultimately develops recommendations for
unstructured Q-sort data, its fundamental import is a parametric model
for structured Q-sorts. Key to the development of this analysis is the
derivation of a sampling (or probability) function. The sampling function
that is derived describes the probability that a given individual will
give any particular response (i.e. any particular ordering of Q-items).
This object ties the Q-sort to other preference ordering and selection
models. Because an analogous preference ordering problem was posed and
then solved by Luce (1959), a brief description of some of Luce's results
is presented prior to the main body of chapter I. Against this background
an axiom similar to Luce's choice axiom is postulated; from this axiom the
functional form of the sampling function is derived.

In chapter II the sampling function is reparametrized to make the
statistical model parsimonious. The essential problem posed by the
structured Q-set, the relation of the item design matrix to the subject
design matrix, is implicit in this reparametrization. In addition,
various modifications to the sampling function are proposed to facilitate
its computation. Each of these modifications transforms the sampling
function into a kind of conditional sampling function.

On the basis of chapter II, conditional likelihood functions can be
formed. In chapter III, these conditional likelihoods become the objective
functions which, when maximized, furnish estimates of the parameters. The
consistency and asymptotic normality of these estimates are immediate
consequences of Andersen (1970).

Chapter  IV  illustrates  the  manner  in  which  the  results  of  the  previous 
chapters  help  to  solve  the  inferential  problems  of  the  structured  Q-studies. 
Interestingly,  while  the  evaluation  of  the  significance  of  "nuisance"  effects 
conforms  to  the  framework  of  the  generalized  likelihood  ratio  tests,  the 
central  hypothesis  of  structured  Q-studies,  the  retrospective  validity  of 
the  Q-set,  does  not.  A  modification  is  proposed  that  enables  the  evaluation 
of  this  hypothesis. 

Chapter  V  develops  a  latent  factor  model  appropriate  for  the  analysis 
of  unstructured  Q-sorts.  This  model  compares  to  that  for  the  structured 
Q-sort as the usual multivariate factor model (Anderson [1958], chapter 11)
compares to the multivariate general linear model (Anderson [1958], chapter
8). Chapter VI presents an example that illustrates the kind of analysis
consequences of the results of the present work.

I. Derivation of the sampling function.

The present chapter develops a model that idealizes the process
by which a single individual sorts the Q-set. This model describes
the stochastic process of the sorting, but only in a sense: unlike a
modeling of the sorting process per se, this idealization does
not depend on any initial conditions, e.g. the initial ordering of the
Q-items; therefore it is considerably simpler.

I.A. Luce's theory of choice behavior.

The  model  of  the  sorting  process,  with  its  derivation,  is  in  many 
ways  parallel  to  that  of  Luce  (1959),  who  idealized  the  process  of 
choosing  the  single  most  preferred  object  from  among  N  such  objects. 
Indeed, by developing a model of the Q-sorting process a new perspective
is gained on Luce's model, a perspective not found in the literature of
mathematical psychology, including the later review by Luce (1977).
To  facilitate  a  comparison,  Luce’s  work  is  briefly  reviewed  in  this 
section. 

I.A.1. Notation for Luce's model.

Let $T = \{x, y, z, \ldots, t\}$ be the (finite) set of objects under
consideration. $T$ is referred to as the universe.

Suppose $A \subset S \subset T$. Let $P_S(A)$ denote the probability that
the object chosen as most preferred is an element of $A$ when the selection
offered was all elements in $S$.

I.A.2. Luce's choice axiom and its consequences.

Luce's axiom consists of the following assertion:

$$P_T(A) = P_S(A)\,P_T(S) \quad \text{for all } A \subset S \subset T, \tag{I.A.2(1)}$$

a statement very similar to that of conditional probability, were
$P_S(A) = P(A \mid S)$. (Unlike the rules of conditional probability, the
axiom I.A.2(1) applies only to nested sets, $A \subset S \subset T$.)

A derived but equivalent form of the axiom is

$$\frac{P_T\{x\}}{P_T\{y\}} = \frac{P_S\{x\}}{P_S\{y\}} \quad \text{for all } S \text{ such that } \{x, y\} \subset S, \tag{I.A.2(2)}$$

subject to regularity conditions that prevent division by zero. As
a direct consequence of I.A.2(2), one may conclude there exists a
function $v: T \to (0, \infty)$, unique up to changes in scale, such that

$$\frac{P_T\{x\}}{P_T\{y\}} = \frac{P_S\{x\}}{P_S\{y\}} = \frac{v(x)}{v(y)}, \tag{I.A.2(3)}$$

whence

$$P_S\{x\} = \frac{v(x)}{\sum_{u \in S} v(u)}.$$
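The closed form for $P_S\{x\}$ can be checked against the choice axiom numerically. The sketch below is an illustration only: the scale values in `v` are hypothetical, and `choice_prob`, `nonempty_subsets`, and `axiom_holds` are names introduced here, not taken from Luce (1959).

```python
from itertools import combinations


def choice_prob(v, S, A):
    """P_S(A): probability that the chosen object lies in A when S is offered,
    using P_S{x} = v(x) / sum of v(u) over u in S."""
    return sum(v[u] for u in A) / sum(v[u] for u in S)


def nonempty_subsets(s):
    """Yield every nonempty subset of s as a set."""
    items = sorted(s)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)


def axiom_holds(v, T, tol=1e-12):
    """Check P_T(A) = P_S(A) P_T(S) for all nested A within S within T."""
    for S in nonempty_subsets(T):
        for A in nonempty_subsets(S):
            lhs = choice_prob(v, T, A)
            rhs = choice_prob(v, S, A) * choice_prob(v, T, S)
            if abs(lhs - rhs) > tol:
                return False
    return True


v = {"x": 1.0, "y": 2.0, "z": 3.0, "t": 4.0}  # hypothetical scale values
T = set(v)
print(axiom_holds(v, T))  # True
```

Multiplying every value in `v` by the same positive constant leaves all of these probabilities unchanged, which is the sense in which $v$ is unique only up to changes in scale.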


I.A.3.  Interpretation  of  Luce’s  axiom. 

Luce's axiom, in the form of I.A.2(2), is sometimes described
as expressing a notion of "independence of irrelevant alternatives,"
with the following meaning: Suppose in the course of selecting the most
preferred object from the set S, the choice narrows to one between
elements x and y. Then the final decision is made by considering
only the merits of objects x and y; the properties of all other
elements in S are "irrelevant".

Analogies  to  Luce’s  choice  axiom  and  its  consequences  will  be 
made  in  the  following. 


I.B.  The  exchange  axioms. 


Like  Luce’s  axiomatization,  the  following  focuses  on  the  behavior 
of  a  single  individual.  The  behavior,  however,  will  consist  of  the 
sorting  of  the  items  of  the  Q-set.  For  the  moment,  and  for  present 
convenience  only,  the  task  will  be  to  completely  order  the  Q-items. 

I.B.1. Notation.

Let $I$ be the number of items and let the items be indexed
$1, 2, \ldots, I$. Let $\pi$, a permutation, represent an ordering of these
items. The k-th component of $\pi$, denoted $\pi(k)$, is considered the
index of the item ranked $I-k+1$. Thus,

$\pi(1)$ = index of the item ranked lowest,

$\pi(2)$ = index of the item ranked second lowest,

$\pi(I)$ = index of the item ranked highest.

$p(\pi)$ is the probability that $\pi$ will occur; $p(\pi)$ is called the
sampling function.

Let $\mathcal{P}$ be the space of all permutations $\pi$. Let $\ell$ and
$m$ be ranks such that $\ell > m$. Define the operator
$T(\cdot;\ell,m): \mathcal{P} \to \mathcal{P}$ as the one-to-one onto map
such that

$T(\pi;\ell,m)(k) = \pi(k)$ for all $k \ne \ell$ and $k \ne m$,
$T(\pi;\ell,m)(\ell) = \pi(m)$, and
$T(\pi;\ell,m)(m) = \pi(\ell)$.

Thus, $T(\pi;\ell,m)$ represents an ordering identical to that of $\pi$
save that the indices $\pi(m)$ and $\pi(\ell)$ have been exchanged. This
corresponds to the interchange of the two items, those ranked in the
$\ell$-th and m-th positions, in the ordering of the Q-set which $\pi$
denotes.
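The exchange operator is simple to state in code. In the sketch below, an ordering is stored as a tuple whose entry at position k (counting from 1) is the item index $\pi(k)$; this representation and the function name `exchange` are assumptions made for illustration.

```python
def exchange(pi, l, m):
    """T(pi; l, m): return the ordering obtained from pi by swapping the item
    indices held at positions l and m (1-based, with l > m)."""
    if not (1 <= m < l <= len(pi)):
        raise ValueError("need 1 <= m < l <= I")
    out = list(pi)
    out[l - 1], out[m - 1] = out[m - 1], out[l - 1]
    return tuple(out)


pi = (3, 1, 4, 2)          # pi(1) = 3, pi(2) = 1, pi(3) = 4, pi(4) = 2
print(exchange(pi, 4, 2))  # (3, 2, 4, 1)
```

Applying the same exchange twice returns the original ordering, which is one way to see that the map is one-to-one and onto.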

I.B.2.  The  first  exchange  axiom  and  its  consequences. 

The first exchange axiom asserts

$$\frac{p(\pi)}{p(T(\pi;\ell,m))} = h(\pi(\ell), \pi(m); \ell, m) \quad \text{for all } \ell > m \text{ and for all } \pi, \tag{I.B.2(1)}$$

with the following interpretation: The two permutations, $\pi$ and
$T(\pi;\ell,m)$, differ only in their placement of the items $\pi(\ell)$
and $\pi(m)$. Thus, in choosing between these two permutations, the decision
intuitively ought not be based upon the properties of the other items. The
ranks of all these other $I-2$ items are the same for the two permutations
being compared; for this reason they are "irrelevant." Parallel to
Luce's choice axiom, the notion of "independence of irrelevant
alternatives" is the critical justification here.

A consequence of I.B.2(1) is that there exist positive parameters
$\{p(i)\}$ and a function $S: \{1, \ldots, I\} \to (-\infty, \infty)$ such
that

$$\frac{p(\pi)}{p(T(\pi;\ell,m))} = \left[\frac{p(\pi(\ell))}{p(\pi(m))}\right]^{S(\ell)-S(m)}. \tag{I.B.2(2)}$$

(The essential point in the derivation is the observation that the ratio

$$\frac{p(\pi_0)}{p(\pi_1)} \cdot \frac{p(\pi_1)}{p(\pi_2)}$$

does not depend on $\pi_1$.) This statement is similar in form to that of
I.A.2(3), where the $\{p(i)\}$ correspond to the $\{v(x)\}$ of Luce's model.
These parameters $\{p(i)\}$ will be referred to as "propensities", $p(i)$
being the propensity of item i to be ranked highly. The function $S(\cdot)$
acts as a scaling function that defines the "distance" between the various
ranks. Considerable discussion will be devoted to this scaling function
in the remainder of this chapter.

The following axiom gives some insight into the role of $S(\cdot)$:

Define $T(\cdot;\ell): \mathcal{P} \to \mathcal{P}$ by

$T(\pi;\ell)(k) = \pi(k)$, for $k \ne \ell$, $k \ne \ell+1$,
$T(\pi;\ell)(\ell) = \pi(\ell+1)$,
$T(\pi;\ell)(\ell+1) = \pi(\ell)$.

Thus $T(\pi;\ell)$ represents the exchange of two adjacent Q-items, i.e. the
$\ell$-th and $(\ell+1)$-st items.

The second exchange axiom, which is a specialization of the first,
asserts that

$$\frac{p(\pi)}{p(T(\pi;\ell))} = h(\pi(\ell), \pi(\ell+1)), \tag{I.B.2(3)}$$

which, like the first exchange axiom, asserts a notion of independence
of irrelevant alternatives. But the consequences of I.B.2(3) are more
severe; I.B.2(3) postulates there exist $\{p(i)\}$ such that

$$\frac{p(\pi)}{p(T(\pi;\ell))} = \frac{p(\pi(\ell+1))}{p(\pi(\ell))}, \tag{I.B.2(4)}$$

a statement that is remarkably similar to I.A.2(3) and that constitutes
a simplification of I.B.2(2). I.B.2(4) has $S(\ell+1) - S(\ell) = c_1$ and
hence $S(\ell) = c_1(\ell - c_2)$, with $c_1$, $c_2$ arbitrary constants.
The role of $S(\cdot)$ then is that of a scaling function, as was mentioned
above, measuring a sort of interval of discrimination between the ranks
ranging from 1 to I.

The strength, or severity, depending on one's point of view, of the
second exchange axiom is its assertion that all the intervals of
discrimination between the ranks are of equal importance. This assumption
is generally not appropriate and will be modified below.


I.C.  Secondary  axioms  and  properties  of  the  model. 

A consequence of the first exchange axiom is that the functional
form of $p(\cdot)$ is determined up to the parameters $\{p(i)\}$ and the
scaling function $S(\cdot)$. Thus,

$$p(\pi) = \frac{\prod_{k=1}^{I} p(\pi(k))^{S(k)}}{\sum_{\pi'} \prod_{k'=1}^{I} p(\pi'(k'))^{S(k')}}, \tag{I.C(1)}$$

where $\sum_{\pi'}$ denotes summation over all permutations $\pi'$. The
denominator simply ensures that the probabilities sum to unity.
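For a very small Q-set the sampling function I.C(1) can be evaluated by brute force, enumerating all I! orderings. The sketch below does this and checks the exchange identity I.B.2(2) on every ordering. The propensities, the scaling values, and the function names are hypothetical choices made for illustration; the enumeration is not a practical method for realistic deck sizes.

```python
from itertools import permutations
from math import isclose, prod


def sampling_function(p, S):
    """Return {ordering: probability}, where an ordering pi is a tuple with
    pi[k-1] = pi(k) and its weight is the product of p[pi(k)] ** S(k).
    p maps item index -> positive propensity; S[k-1] holds S(k)."""
    weights = {
        pi: prod(p[i] ** S[k] for k, i in enumerate(pi))
        for pi in permutations(sorted(p))
    }
    total = sum(weights.values())
    return {pi: w / total for pi, w in weights.items()}


def exchange_identity_holds(p, S):
    """Check p(pi) / p(T(pi; l, m)) = (p[pi(l)] / p[pi(m)]) ** (S(l) - S(m))
    for every ordering pi and every pair of positions l > m."""
    probs = sampling_function(p, S)
    n = len(S)
    for pi in probs:
        for l in range(2, n + 1):
            for m in range(1, l):
                t = list(pi)
                t[l - 1], t[m - 1] = t[m - 1], t[l - 1]
                lhs = probs[pi] / probs[tuple(t)]
                rhs = (p[pi[l - 1]] / p[pi[m - 1]]) ** (S[l - 1] - S[m - 1])
                if not isclose(lhs, rhs):
                    return False
    return True


p = {1: 0.5, 2: 1.0, 3: 2.0, 4: 4.0}   # hypothetical propensities
S = [-1.5, -0.5, 0.5, 1.5]             # hypothetical scaling function
probs = sampling_function(p, S)
print(exchange_identity_holds(p, S))   # True
```

With an increasing scaling function, the most probable ordering places the items in order of increasing propensity, so that the item with the largest propensity occupies the highest rank.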

I.C.1. The monotone axiom.

Note that changes in scale in the $\{p(i)\}$ are equivalent
to changes in location for $S(\cdot)$, while changes in power in the
$\{p(i)\}$ are equivalent to changes in scale for $S(\cdot)$. In this
sense, then, $S(\cdot)$ is determined up to affine transformations.
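These equivalences can be seen numerically on a three-item example. The sketch below assumes the product form of the sampling function given in I.C(1); the particular numbers and the helper `same_distribution` are illustrative only.

```python
from itertools import permutations
from math import isclose, prod


def sampling_function(p, S):
    """Brute-force sampling function: weight of pi is prod p[pi(k)] ** S(k)."""
    weights = {
        pi: prod(p[i] ** S[k] for k, i in enumerate(pi))
        for pi in permutations(sorted(p))
    }
    total = sum(weights.values())
    return {pi: w / total for pi, w in weights.items()}


def same_distribution(a, b):
    """True if two sampling functions agree on every ordering."""
    return all(isclose(a[pi], b[pi]) for pi in a)


p = {1: 0.5, 2: 1.0, 3: 3.0}   # hypothetical propensities
S = [0.0, 1.0, 3.0]            # hypothetical scaling function

base = sampling_function(p, S)
# A change of scale in the propensities leaves the probabilities unchanged ...
rescaled = sampling_function({i: 7.0 * x for i, x in p.items()}, S)
# ... as does a change of location in S.
shifted = sampling_function(p, [s + 2.5 for s in S])
# Raising the propensities to a power matches multiplying S by that power.
powered = sampling_function({i: x ** 2.0 for i, x in p.items()}, S)
stretched = sampling_function(p, [2.0 * s for s in S])

print(same_distribution(base, rescaled),
      same_distribution(base, shifted),
      same_distribution(powered, stretched))  # True True True
```

In both of the first two cases every weight is multiplied by the same constant, which the normalizing denominator removes.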

A natural regularity condition to postulate of $p(\pi)$ is for
the ratio

$$\frac{p(\pi)}{p(T(\pi;\ell))} = \left[\frac{p(\pi(\ell+1))}{p(\pi(\ell))}\right]^{S(\ell+1)-S(\ell)} \tag{I.C.1(1)}$$

to be increasing in $p(\pi(\ell+1))$. The rationale is the following:
If $p(i)$ ($i = \pi(\ell+1)$) measures some propensity of the i-th
item to be ranked highly, then any increase in this propensity
should be reflected in an increase in the likelihood of item i
being ranked over any other item, in particular, over item $\pi(\ell)$.

If the ratio I.C.1(1) is to be increasing in $p(\pi(\ell+1))$,
then $S(\ell+1) - S(\ell)$ must be positive; hence, $S(\cdot)$ must be
monotone increasing. For this reason, I.C.1(1) is called the monotone axiom.

The monotone axiom enforces a property that parallels one in
Luce's model, one directly deducible from Luce's choice axiom. This
property, strong stochastic transitivity, is expressed in terms of the
pairwise preference probabilities

$$P(x,y) = P_A\{x\} \quad \text{where } A = \{x, y\},$$

recalling the notation of section I.A. The property of strong stochastic
transitivity says that

if $P(x,y) \ge \tfrac{1}{2}$ (y is not preferred to x)

and $P(y,z) \ge \tfrac{1}{2}$ (z is not preferred to y)

then $P(x,z) \ge \max\{P(x,y), P(y,z)\}$.
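Under Luce's model the pairwise probabilities take the form $P(x,y) = v(x)/(v(x)+v(y))$, and strong stochastic transitivity can be verified exhaustively for any given set of scale values. The following sketch does so; the scale values and function names are hypothetical.

```python
from itertools import permutations


def pairwise(v, x, y):
    """P(x, y): probability that x is chosen from the pair {x, y}."""
    return v[x] / (v[x] + v[y])


def strongly_transitive(v, tol=1e-12):
    """Check that P(x,y) >= 1/2 and P(y,z) >= 1/2 together imply
    P(x,z) >= max(P(x,y), P(y,z)) for every ordered triple of objects."""
    for x, y, z in permutations(v, 3):
        pxy, pyz = pairwise(v, x, y), pairwise(v, y, z)
        if pxy >= 0.5 and pyz >= 0.5:
            if pairwise(v, x, z) < max(pxy, pyz) - tol:
                return False
    return True


v = {"x": 1.0, "y": 2.0, "z": 3.0, "t": 4.0}  # hypothetical scale values
print(strongly_transitive(v))  # True
```

The check succeeds for any positive scale values because $P(x,y) \ge \tfrac{1}{2}$ is equivalent to $v(x) \ge v(y)$, and $v(x)/(v(x)+c)$ is increasing in $v(x)$ and decreasing in $c$.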

Objects such as these pairwise preferences, simple though they
be, are not natural in the present context of the Q-sort problem.
However, the analogy is interesting. Note that the monotonicity
of the ratio in I.C.1(1) is the same kind of property as that of
strong stochastic transitivity in Luce's model, for essentially the
same reason. For Luce's model, stochastic transitivity follows from
the linear ordering of the objects that is induced by $v(\cdot)$. Similarly,
for the Q-sorting model, a linear ordering is implicit in the scalar
quantities $\{p(i)\}$. The monotone axiom ensures the linearity of this
ordering.

I.C.2.  The  palindrome  axiom. 

Let us denote the ranking that is the reverse of that connoted
by $\pi$ as $\tilde{\pi}$. Thus

$$\tilde{\pi}(k) = \pi(I-k+1), \quad k = 1, \ldots, I.$$

The following axiom is sometimes reasonable:

$$\frac{p(\pi)}{p(\pi')} = \frac{p(\tilde{\pi}')}{p(\tilde{\pi})} \quad \text{for all } \pi, \pi'. \tag{I.C.2(2)}$$

This axiom asserts the following kind of invariance: If the
magnitude of the effects is reversed, and if the rankings that
empirically measure these magnitudes are also reversed, no distortion
in the structure of the probabilities would occur.

The condition that I.C.2(2) imposes upon the scaling function
is that

$$S(k) = -S(I-k+1), \quad k = 1, \ldots, I.$$

Such a property is commonly called "rank reversibility," although
palindrome invariance has recently been suggested (McCullagh [1978]).

For this reason, I.C.2(2) is called the palindrome axiom.
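The link between the palindrome axiom and the antisymmetry of the scaling function can be illustrated on a three-item example. The sketch below assumes the product form of the sampling function in I.C(1) and reads the axiom as equating the ratio of two orderings with the inverse ratio of their reversals; the numbers and names are hypothetical.

```python
from itertools import permutations
from math import isclose, prod


def sampling_function(p, S):
    """Brute-force sampling function: weight of pi is prod p[pi(k)] ** S(k)."""
    weights = {
        pi: prod(p[i] ** S[k] for k, i in enumerate(pi))
        for pi in permutations(sorted(p))
    }
    total = sum(weights.values())
    return {pi: w / total for pi, w in weights.items()}


def palindrome_invariant(p, S):
    """Check p(a) / p(b) = p(reverse of b) / p(reverse of a) for all a, b."""
    probs = sampling_function(p, S)
    for a in probs:
        for b in probs:
            lhs = probs[a] / probs[b]
            rhs = probs[b[::-1]] / probs[a[::-1]]
            if not isclose(lhs, rhs):
                return False
    return True


p = {1: 1.0, 2: 2.0, 3: 5.0}                      # hypothetical propensities
print(palindrome_invariant(p, [-1.0, 0.0, 1.0]))  # True
print(palindrome_invariant(p, [0.0, 1.0, 3.0]))   # False
```

A scaling function such as 0, 1, 2 also passes the check, since it differs from -1, 0, 1 only by a change of location, which leaves the probabilities unchanged. The condition $S(k) = -S(I-k+1)$ is therefore best read as holding after $S(\cdot)$ has been centered.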

Luce's choice axiom has been shown inconsistent with this
concept of palindrome invariance (Luce [1959], Marley [1968]). Thus,
the axiom I.C.2(2) represents a qualitative distinction between the
Q-sorting model and Luce's model.

I.C.3.  Axioms  that  determine  the  scaling  function. 

Until now, the sorting task has been assumed to be that of
completely ordering the Q-items, a task very unrepresentative of standard
Q-practice. This restriction was made for convenience only and
this section will be devoted to relieving this restriction. Key to
this discussion will be the choice of the scaling function.

The  scaling  function  S(*)  allows  for  adaptation  of  the  model 
involving  a  complete  ranking  of  the  Q-items  to  the  more  common  case 
involving  sorting  in  accordance  with  a  forced  distribution.  While  the 
category  sizes  of  the  forced  distribution  can  in  principle  be  accounted 
for  by  a  summation  over  all  compatible  rankings,  such  a  summation  is 
numerically  complicated  to  implement.  In  addition,  such  a  procedure 
fails  to  represent  in  the  model  the  fact  that  the  forced  distribution 
is  an  a  priori,  designed  feature  of  the  sorting  task. 

As a simple alternative, one can represent the forced distribution
by equating the values of the scaling function for ranks residing in the
same category, thereby equating the ranks themselves. Were the sizes of
the categories 5, 8, 12, 16, 18, 16, 12, 8, and 5, then S(1) through S(5)
would have a common value, as would S(6) through S(13), S(14) through S(25),
and so on. While this procedure does not completely determine the form
of $S(\cdot)$, it does greatly reduce its complexity.


By requiring the scaling function to be constant for ranks that share
the same category in the forced distribution, another form of the exchange
axiom is motivated, one as restrictive as the second exchange axiom but
one that exploits the presence of the forced distribution. We make use
of the following notation:

Let $\{C_k\}$ be mutually exclusive and exhaustive sets that partition
the ranks 1 to I. Let the k-th of these sets, $C_k$, correspond to the
k-th category of the forced distribution. For example, for the forced
distribution with category sizes 5, 8, 12, 16, 18, 16, 12, 8, and 5,
$C_1$ would be the set $\{1, 2, \ldots, 5\}$, $C_2$ the set
$\{6, 7, \ldots, 13\}$, $C_3$ the set $\{14, 15, \ldots, 25\}$, and so on.
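The partition of the ranks and a scaling function that is constant within categories can be built mechanically from the category sizes. The sketch below is illustrative: the function names are hypothetical, and the category values 1 through 9 are an arbitrary placeholder, since nothing up to this point fixes the common value within each category.

```python
def category_sets(sizes):
    """Partition the ranks 1..I into consecutive sets C_1, C_2, ... whose
    sizes are given by the forced distribution."""
    sets, start = [], 1
    for size in sizes:
        sets.append(list(range(start, start + size)))
        start += size
    return sets


def step_scaling(sizes, values):
    """Return [S(1), ..., S(I)], constant within each category;
    values[k] is the common value assigned to the (k+1)-th category."""
    return [v for size, v in zip(sizes, values) for _ in range(size)]


SIZES = [5, 8, 12, 16, 18, 16, 12, 8, 5]
C = category_sets(SIZES)
S = step_scaling(SIZES, values=range(1, 10))  # placeholder category values

print(C[0], C[1][0], C[1][-1], C[2][0], C[2][-1])  # [1, 2, 3, 4, 5] 6 13 14 25
```

A scaling function of this kind has at most nine distinct values rather than one hundred, which is the reduction in complexity described above.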

The notion of fixing the scaling function to a common value for the
ranks in the same category of the forced distribution may be represented
formally by the axiom

$$\frac{p(\pi)}{p(T(\pi;\ell,m))} = 1 \quad \text{for all } \ell, m \in C_k, \text{ for all } k, \text{ and for all } \pi. \tag{I.C.3(1)}$$

The consequence of I.C.3(1) is that for all k

$$S(\ell) = S(m) \quad \text{whenever } \ell, m \in C_k.$$

We shall refer to the axiom I.C.3(1) as the first scaling axiom.

I.C.3(1) is the first axiom to be postulated that exploits the
use of the forced distribution. All previous axioms, in particular
the first and second exchange axioms, were cast in the context of
completely ordering the Q-set. Most notably, the second exchange
axiom was able to largely determine the scaling function. The
following assertion resembles the second exchange axiom but exploits
the structure of the forced distribution:

Let $m \in C_k$ and $\ell \in C_{k+1}$. The second scaling axiom asserts
that for all $\pi$,

$$\frac{p(\pi)}{p(T(\pi;\ell,m))} = h(\pi(\ell), \pi(m)). \tag{I.C.3(2)}$$

(The distinction between I.C.3(2) and the second exchange axiom is that
for I.C.3(2) $\ell$ and $m$ are restricted to adjacent categories, while for
the second exchange axiom $\ell$ and $m$ were adjacent ranks.) From
I.C.3(2) it follows that if there is a k such that $m \in C_k$ and
$\ell \in C_{k+1}$, then $S(\ell) - S(m)$ is a constant. As a consequence
of I.C.3(2), the scaling function is essentially determined, that is,
determined up to changes in location and scale.
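The two scaling axioms can be illustrated together on a toy forced distribution. The sketch below assumes the product form of the sampling function in I.C(1), takes four items sorted into categories of sizes 1, 2, 1, and uses a scaling function that is constant within categories with a unit step between adjacent ones. All numbers and names are hypothetical.

```python
from itertools import permutations
from math import isclose, prod


def sampling_function(p, S):
    """Brute-force sampling function: weight of pi is prod p[pi(k)] ** S(k)."""
    weights = {
        pi: prod(p[i] ** S[k] for k, i in enumerate(pi))
        for pi in permutations(sorted(p))
    }
    total = sum(weights.values())
    return {pi: w / total for pi, w in weights.items()}


sizes = [1, 2, 1]                                   # toy forced distribution
category = [k for k, n in enumerate(sizes, 1) for _ in range(n)]  # [1, 2, 2, 3]
S = [c - (len(sizes) + 1) / 2 for c in category]    # [-1.0, 0.0, 0.0, 1.0]
p = {1: 0.5, 2: 1.0, 3: 2.0, 4: 4.0}                # hypothetical propensities
probs = sampling_function(p, S)


def scaling_axioms_hold():
    """Exchanges within a category leave the probability unchanged; exchanges
    between adjacent categories change it by the ratio p[pi(l)] / p[pi(m)]."""
    n = len(S)
    for pi in probs:
        for l in range(2, n + 1):
            for m in range(1, l):
                t = list(pi)
                t[l - 1], t[m - 1] = t[m - 1], t[l - 1]
                ratio = probs[pi] / probs[tuple(t)]
                if category[l - 1] == category[m - 1]:
                    expected = 1.0
                elif category[l - 1] == category[m - 1] + 1:
                    expected = p[pi[l - 1]] / p[pi[m - 1]]
                else:
                    continue  # non-adjacent categories are not covered here
                if not isclose(ratio, expected):
                    return False
    return True


print(scaling_axioms_hold())  # True
```

Orderings that differ only by rearrangements within a category receive equal probability, which is how the model represents the ties permitted by the forced distribution.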

A  reasonable  alternative  to  completely  specifying  the  scale  function 
is  to  estimate  it  statistically.  This  notion  will  be  developed  in  the 
next  chapter. 

I.C.4. Context as defined by the Q-set.

One attraction of Q is the capacity to build and enforce
a vocabulary; the introspection of the sorting process can be
required to be done with reference to standard items. Stephenson
(1953) used this idea to build Q-sets specific to particular psychological
theories, while Block (1961) used this idea in order to transcend
the vocabularies of particular theories of personality, and especially
to transcend idiosyncratic adaptations of these vocabularies. At the
heart of this vocabulary-enforcing capacity is the idea of a "set", the
global framework that personality and behavior inventories imply by
asking the questions they do. Curiously, the formal model expressed
by the sampling function in I.C(1) has an interesting property with
regard to this concept of "set".

Nowhere in the derivation of the functional form of $p(\cdot)$ are
the $\{p(i)\}$ defined except in the context of all I items simultaneously.
Thus, were one to add a single item to the original Q-set or to delete
one from the existing Q-set, nothing in the formal theory of the axioms
presented above would allow one to infer that the properties of the
original Q-set would be at all similar to the properties of the newly
constituted Q-set. This formal non-correspondence of almost identical
Q-sets sharply distinguishes the present model from that of Luce. In
Luce's model, the choice axiom explicitly restricts the manner
in which behavior, that is, the choice probabilities, can change in
response to the context, i.e. the selection of choices available. Thus,
Luce's model is a direct result of assuming stability as the context
changes; the Q-sorting model, by contrast, makes no such assumption.

With this point in mind, the occasional practice of subsampling
from a larger Q-set to ease the task of the rater (e.g. Jackson and
Bidwell [1959]) requires, in the frame of this axiomatic development,
an additional axiom to justify it. This axiom, which would assert
that p(i) is the same for any Q-set in which i appears, could be
appropriately called "context irrelevance." Further, if the items
are chosen by explicitly sampling from a larger "population" of items, a
similar axiom is required if generalizations to the item "population"
as a whole are to be made.

The assumption of context irrelevance has practical consequences. One
can easily imagine that some questionnaires achieve a certain "set" in
their responders by asking certain questions in a certain order, a
"set" that in turn can be reflected in their responses. Quite conceivably,
to ask more or fewer questions, or different questions in a different
order, would achieve a different "set" and would result in different
responses. Thus, although the issue is difficult to address experimentally,
by no means is it insignificant.

II. The statistical model.

This  chapter  leaves  behind  the  axiomatic  development  of  the 
sampling  function;  the  focus  is  exclusively  upon  the  form  of  various 
(conditional)  likelihood  functions  of  the  structured  Q-sort.  This 
focus  is  no  doubt  curious  to  some,  for  the  likelihood  functions 
furnish  directly  neither  estimates  nor  inferential  procedures.  Only 
in  chapter  III  will  this  deficiency  be  remedied;  there  estimates  and 
inferences  will  be  derived  using  the  theory  of  maximum  likelihood. 

II.A. Model parametrization.

In  this  section  the  unconditional  likelihood  of  the  structured 
Q-sort  will  be  derived. 

II.A.1. Log-linear parameters and duality.

In chapter I the sampling function $p(\cdot)$ was developed for
a single individual; it was parametrized by $\{p(i)\}$. To extend this
sampling model to individuals $j = 1, \ldots, J$, the relevant parameters
are $\{p(i,j)\}$. This is a very large number of parameters; the present
task is to reduce the dimensionality of the parameters from the
excessive number $IJ$ to something smaller.

For the structured Q-sort, items are constructed to represent
levels of various attributes. This structure is represented by an $I \times D$
design matrix, $Q$, whose rows, $Q_i$, are the indicators and level
variables of the corresponding $i$-th Q-item. $Q$ is called the item design
matrix.

Let $p(i,j) = \exp\{Q_i \beta_j'\}$, where $\beta_j$ is a $1 \times D$ vector
consisting of the parameters of the $j$-th subject, which act as the coefficients
of the $D$ variables that compose $Q$.

Finally, let $\beta_j' = \beta w_j'$, where $w_j$ is a $1 \times K$ vector consisting
of the covariates of the $j$-th subject and $\beta$ is the $D \times K$ matrix of unknown
parameters. $W$, the $J \times K$ matrix whose rows are the $w_j$, is called the
subject design matrix.

The interpretation of $\beta$ is intriguing. As developed, the coordinates
$\beta_j' = \beta w_j'$ locate the $j$-th individual in the (dual of the)
design space of the Q-set spanned by the rows of the matrix $Q$. On the
other hand, $y_i = Q_i \beta$ represents the $i$-th Q-item in the (dual of the)
coordinate space spanned by the rows of the matrix $W$. Thus, $\beta$ represents
each of these two linear spaces, the item design space and the
subject design space, to one another by its respective rows and columns.
This duality is reminiscent of the reciprocity principle upon which
much of the debate about Q- versus R-factor analysis focused. (See Burt [1972]
and Burt and Stephenson [1939].)

In this reparametrization, the sampling function of the $j$-th subject
becomes

$$p_j(\pi) = \frac{\exp\{q(\pi)\,\beta w_j'\}}{\sum_{\pi'} \exp\{q(\pi')\,\beta w_j'\}} \tag{II.A.1(1)}$$

II.A.2. Refinements in the parametrization.

This section will develop two refinements of the log-linear
parametrization that are directed at specialized concerns. Both are
parsimonious  and  convenient  to  implement.  In  addition,  each  provides 
some  insight  into  the  workings  of  Q,  and  therefore  provides  useful 
criticism  of  the  potential  strengths  and  weaknesses  of  Q. 

a.  Parametrizing  the  scaling  function. 

In section I.C.1 the scaling function was observed to be
determined up to affine transformations; in section I.C.2 it was
postulated to be skew-symmetric. Because the scaling function
can be interpreted as measuring the "subjective distances" between
the sort's categories, some empirical validation of the scaling
values used might be of interest. In particular, one might wish to
determine if the discrimination between the extreme categories is
greater or less than that between the neutral categories. Such an
issue can be formally considered by the following association: If
the discrimination between the extreme categories is greater than
that between the central categories, consider this well-expressed
by postulating the scaling function to be convex above its median.

If, conversely, the discrimination between the central categories
is greater than that between the extreme categories, consider this
well-expressed by postulating the scaling function to be concave
above its median. The following parametrization is then motivated:

$$S_\alpha(k) = |S(k)|^{\alpha}\,\operatorname{sgn}(S(k)), \qquad k = 1, 2, \ldots, I,$$

where $S(k)$ is any a priori skew-symmetric scaling function with zero
median. Because it employs power transformations of skew-symmetric functions
as surrogates for the wider class of functions that are monotone,
skew-symmetric, and convex (concave) above the median, this parametrization is
especially parsimonious. As a side benefit, the monotone axiom can be
evaluated: an estimate of $\alpha$ less than zero would indicate that monotonicity
was being strongly violated.
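The effect of the exponent can be illustrated numerically. The following is a minimal Python sketch, not part of the original development; the nine-category base scores and the function names `power_scaling` and `is_monotone` are assumptions chosen only for illustration.

```python
import math

def power_scaling(base_scores, alpha):
    """Apply S_alpha(k) = |S(k)|**alpha * sgn(S(k)) to every base score."""
    out = []
    for s in base_scores:
        if s == 0:
            out.append(0.0)  # the median stays at zero for any alpha
        else:
            out.append(math.copysign(abs(s) ** alpha, s))
    return out

def is_monotone(scores):
    """True when the scores are strictly increasing."""
    return all(a < b for a, b in zip(scores, scores[1:]))

# Hypothetical a priori scaling: skew-symmetric with zero median.
base = [-4, -3, -2, -1, 0, 1, 2, 3, 4]

convex = power_scaling(base, 2.0)     # alpha > 1: convex above the median
concave = power_scaling(base, 0.5)    # 0 < alpha < 1: concave above the median
violated = power_scaling(base, -1.0)  # alpha < 0: monotonicity fails
```

With these assumed scores, `convex` widens the gaps between extreme categories, `concave` widens the gaps between central categories, and `violated` is no longer increasing, which matches the reading of a negative estimate of $\alpha$ given above.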

b.  Modeling  variations  in  raters. 

In  this  section  a  particular  kind  of  variation  between  raters  is 
considered.  This  variation  is  due  to  the  differing  levels  of  effort 
that  different  raters  are  likely  to  make  in  producing  their  sorts. 

For each rater $r$ let $a(r)$ be the "acuity" or effort parameter
of that rater. Let $p(i,j,r)$ denote the propensity of item $i$
being ranked highly when rater $r$ rates subject $j$, while $p(i,j)$
becomes some "underlying" propensity of item $i$ being ranked highly
for subject $j$. The modeling relation of the acuity of rater $r$ is

$$p(i,j,r) = p(i,j)^{a(r)}, \qquad i = 1, \ldots, I.$$

For the familiar general linear model the analogue to this parametrization
is that the residual variation is inhomogeneous and that this
inhomogeneity can be attributed to the raters. More importantly, as the
result of this analogy to residual variation, estimates of the $a(r)$ can
be used as indices of relative reliability that enable comparisons to a
"standard" rater.


II. A. 3.  The  unconditional  or  full  likelihood. 


Let the unconditional likelihood function be denoted by
$e^{\mathcal{L}(\beta)}$. Then,

$$e^{\mathcal{L}(\beta)} = \prod_{j=1}^{J} p_j(\pi_j) = \prod_{j=1}^{J} \frac{\exp\{q(\pi_j)\,\beta w_j'\}}{\sum_{\pi'} \exp\{q(\pi')\,\beta w_j'\}} \tag{II.A.3(1)}$$

where $\pi_j$ now denotes the permutation observed as the response of the
$j$-th subject. (Note that $\pi_j$, which connotes a full ranking of the
$I$ items of the Q-set, breaks ties arbitrarily. However, $q(\pi_j)$ is
invariant to the manner in which the ties are broken.)

The  rightmost  expression  in  II. A. 3(1)  is  uncomputable  for  even 
moderate  I  as  it  requires  summations  over  all  possible  permutations 
of  I  objects.  To  avoid  this  problem,  the  remainder  of  this  chapter 
considers various conditional likelihoods, each of which is potentially
computable.
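The computational burden can be made concrete with a brute-force evaluation for a very small deck. The sketch below is illustrative only: the 4-item design matrix, the scaling scores, and the parameter values are assumptions, and `full_probability` simply enumerates all $I!$ permutations in the denominator of the full likelihood.

```python
import math
from itertools import permutations

def q_scores(perm, S, Q):
    """q(pi) = sum_k S(k) * Q_{pi(k)}, where perm[k] is the item holding rank k."""
    D = len(Q[0])
    return [sum(S[k] * Q[perm[k]][d] for k in range(len(perm))) for d in range(D)]

def linear_predictor(q, beta, w):
    """The scalar q * beta * w' for a 1 x D vector q, D x K matrix beta, 1 x K vector w."""
    return sum(q[d] * beta[d][k] * w[k]
               for d in range(len(q)) for k in range(len(w)))

def full_probability(perm, S, Q, beta, w):
    """Probability of one permutation; the denominator has I! terms."""
    num = math.exp(linear_predictor(q_scores(perm, S, Q), beta, w))
    den = sum(math.exp(linear_predictor(q_scores(p, S, Q), beta, w))
              for p in permutations(range(len(Q))))
    return num / den

# Assumed toy inputs: I = 4 items, D = 2 design columns, K = 1 covariate.
Q = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
S = [-1.5, -0.5, 0.5, 1.5]          # skew-symmetric scaling scores
beta = [[0.4], [-0.2]]
w = [1.0]

total = sum(full_probability(p, S, Q, beta, w) for p in permutations(range(4)))
n_terms = math.factorial(len(Q))    # 24 here; about 9.3e157 for I = 100
```

Even this toy case sums 24 terms per subject; the factorial growth of `n_terms` is what rules out direct evaluation for realistic deck sizes.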
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II.  B.  Model  simplification. 


Recall that the full likelihood was computationally intractable because
the denominator of each $p_j(\pi)$ contained a very large number, $I!$,
of terms. A natural simplification is to limit the number of terms in these
denominators. The issue is then to determine the criterion by which these
terms are chosen. Note that if one of these terms in the denominator of
$p_j(\pi_j)$ is the numerator itself, two benefits accrue. First, $p_j(\pi_j)$ is
bounded above by unity, thereby remaining a proper probability measure.
Second, $p_j(\pi_j)$ can then be interpreted as a conditional probability.

The  motivation  here  to  employ  conditional  likelihoods  is  atypical. 

More  commonly,  conditional  likelihoods  are  employed  to  eliminate  so-called 
incidental  parameters  (Neyman  and  Scott  [1948])  whose  estimation  would 
otherwise  consume  too  many  degrees  of  freedom.  Alternatively,  conditional 
likelihoods  are  sometimes  explicitly  induced  by  conditioning  on  those 
statistics  that  have  no  apparent  relevance.  (See,  for  example,  Godambe 
[1980].)  Although  computational  simplicity  is  sometimes  listed  as  one  of 
the  virtues  of  conditional  inference,  it  is  usually  subordinate  to  another 
criterion.  Here,  however,  it  must  be  pre-eminent. 

In conceding this pre-eminence, the choice of the form of the conditional
likelihood is largely undetermined. This section presents two kinds
of conditional likelihoods. The first exploits the special structure that
Stephenson built into his Q-sorts: a completely balanced, cross-factorial
design. The second applies to more general Q-designs.

II.B.1. The balanced, cross-factorial designs.


Stephenson's Q-sets all had the following design: First,
he would characterize a theory (of personality, of aesthetics, etc.)
by $F$ factors, the $f$-th of which would have $L_f$ levels. Then,
with an $L_1 \times L_2 \times \cdots \times L_F$ design in mind, he would develop a
certain number of Q-items to be representative of the traits that
characterize, according to the theory, each cell in this cross-factorial
design. When each cell is represented by an equal number of Q-items
the item design matrix is said to be balanced. Kerlinger (1972) gives
a succinct but complete description of the ways in which Stephenson
used these balanced, cross-factorial designs.

For clarity of presentation, let us consider $2 \times \cdots \times 2 = 2^F$
designs first. The convenience of doing this derives from the fact that
in this case $D = F$. For $F = 3$, the $Q$ matrix would have the typical rows

     1   1   1
     1   1  -1
     1  -1   1
     1  -1  -1
    -1   1   1
    -1   1  -1
    -1  -1   1
    -1  -1  -1

Thus each row represents a vertex of a cube whose center of mass is
at the origin $(0,0,0)$. Let the conditional likelihood for the
balanced Q-sort, $e^{\mathcal{L}_{BC}(\beta)}$, then be

$$e^{\mathcal{L}_{BC}(\beta)} = \prod_{j=1}^{J} \frac{\exp\{q(\pi_j)\,\beta w_j'\}}{\sum_{S} \exp\{q(\pi_j)\,S\beta w_j'\}} \tag{II.B.1(1)}$$

where $\sum_S$ denotes summation over all diagonal matrices $S$ of order
$F$ with $+1$ and $-1$ being the only diagonal entries available. Recall
that $q(\pi_j)$ are the marginal Q-scores of the $D$ dimensions of the item
design matrix. Thus,

$$q(\pi_j) = \sum_{k=1}^{I} S(k)\, Q_{\pi_j(k)} .$$

The above form corresponds to conditioning on

$$|q(\pi_j)| = \bigl(|q_1(\pi_j)|,\ |q_2(\pi_j)|,\ \ldots,\ |q_F(\pi_j)|\bigr).$$


Geometrically, this corresponds to conditioning on the event that
$q(\pi_j)$ is one of the vertices of the right rectangular prism whose
center of mass is at the origin and whose vertices are
$(\pm|q_1(\pi_j)|, \pm|q_2(\pi_j)|, \ldots, \pm|q_F(\pi_j)|)$. The rationale for
conditioning as above is the following: (1) The vertices $\{q(\pi_j)S\}$ that
compose the orbit of the conditioning have centroid at the origin and
in that sense do not "bias" the likelihood in any particular direction.
(2) The vertices $\{q(\pi_j)S\}$ span the design space whenever $q_f(\pi_j) \neq 0$
for all $f = 1, \ldots, F$; that is, these vertices span the design space
whenever it is intuitively reasonable that they should, and otherwise they span
the linear subspace of the design space that holds the information that
$q(\pi_j)$ indicates is available.
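For a single subject the sign-flip orbit is small enough to enumerate directly. The following minimal sketch assumes a $2^F$ design with $F = 2$ and $K = 1$; the function name and the numerical values are hypothetical.

```python
import math
from itertools import product

def balanced_conditional_probability(q, beta, w):
    """One subject's factor in the 2^F conditional likelihood.

    q is the 1 x F vector of marginal Q-scores, beta is F x K, w is 1 x K.
    The denominator sums over all 2^F diagonal sign matrices.
    """
    F = len(q)
    eta = [sum(beta[f][k] * w[k] for k in range(len(w))) for f in range(F)]
    num = math.exp(sum(q[f] * eta[f] for f in range(F)))
    den = sum(math.exp(sum(s[f] * q[f] * eta[f] for f in range(F)))
              for s in product((1, -1), repeat=F))
    return num / den

# Assumed values: Q-scores for F = 2 factors and a single constant covariate.
q = [-4.0, -2.0]
beta = [[0.4], [-0.2]]
w = [1.0]

p_observed = balanced_conditional_probability(q, beta, w)

# The probabilities of the four vertices of the orbit add to one.
orbit_total = sum(
    balanced_conditional_probability([s0 * q[0], s1 * q[1]], beta, w)
    for s0, s1 in product((1, -1), repeat=2)
)
```

Because the exponent is additive over factors, the denominator also factors as a product of terms $2\cosh(q_f \eta_f)$ with $\eta = \beta w_j'$, so the cost grows only linearly in $F$ if that closed form is used instead of enumeration.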

Let us now consider the case where the Q-design is of the
$L_1 \times L_2 \times \cdots \times L_F$ type. Denote by $q_f(\pi_j)$ the vector

$$\bigl(q_{f1}(\pi_j),\ q_{f2}(\pi_j),\ \ldots,\ q_{fL_f}(\pi_j)\bigr),$$

$q_{f\ell}(\pi_j)$ being the marginal Q-score of the $j$-th subject for the
$f$-th factor at the $\ell$-th level. If we adopt the restriction that
holds the scaling function to be skew-symmetric about $(I+1)/2$ (the
palindrome axiom), then, for each $f$,

$$\sum_{\ell=1}^{L_f} q_{f\ell}(\pi_j) = 0 . \tag{II.B.1(2)}$$

This redundancy requires that $D = \sum_f (L_f - 1)$ if $Q$ is to be of full
rank $D$. This is not convenient, however, so we shall use the redundant
item design matrix $Q$ with $D = \sum_f L_f$, but of rank $\sum_f (L_f - 1)$.

We generalize the conditional likelihood of the $2^F$ balanced
factorial design by considering all permutations of the elements in
$q_f(\pi_j)$. Thus,

$$e^{\mathcal{L}_{BC}(\beta)} = \prod_{j=1}^{J} \frac{\exp\{q(\pi_j)\,\beta w_j'\}}{\prod_{f=1}^{F} \sum_{P_f} \exp\{q_f(\pi_j)\,P_f \beta_f w_j'\}} \tag{II.B.1(3)}$$

where $\beta = (\beta_1, \ldots, \beta_f, \ldots, \beta_F)$ and
$q(\pi_j) = (q_1(\pi_j), \ldots, q_f(\pi_j), \ldots, q_F(\pi_j))$, and $\sum_{P_f}$
denotes summation over all permutation matrices $P_f$ of order $L_f$.
Because of the restriction II.B.1(2), the constraints

$$\sum_{\ell=1}^{L_f} \beta_{f\ell} = 0 ,$$

where the $\beta_{f\ell}$ are the columns of $\beta_f$, are necessary for $\beta$ to be
identified. The utility of II.B.1(3) is that the denominator is computable.
As for the $2^F$ design, a geometric interpretation of the
orbit of conditioning is possible here. For example, when $F = 2$,
$L_1 = 2$ and $L_2 = 3$, the vertices that compose the orbit of the
conditioning correspond to the corners of a right six-sided prism
whose two hexagonal faces are generated by the permutations of the
three levels within the second factor, each of the two faces corresponding
to a level of the first factor. As for the $2^F$ design, the
vertices

$$\{q(\pi_j)P : P \text{ a permutation of levels within factors}\}$$

compose an orbit whose centroid is the origin. Also, if the elements of
$q_f(\pi_j)$ are not identical for $f = 1, \ldots, F$, these vertices
span the whole design space.

The information that is suppressed by conditioning as above
deserves discussion. Key is the observation that the design space is
spanned only when the elements within each $q_f(\pi_j)$ are not identical.
Suppose that for all subjects the elements of $q_{f_0}(\pi_j)$ are
identical (zero). While one might wish the corresponding matrix $\beta_{f_0}$
to be estimated by a null array, in fact $\beta_{f_0}$ is not identifiable in
$e^{\mathcal{L}_{BC}(\beta)}$. If, now, only the $j_0$-th subject has non-identical
$q_{f_0}(\pi_{j_0})$, one would anticipate that the preponderance of null
$q_{f_0}(\pi_j)$'s would push any reasonable estimate of $\beta_{f_0}$ toward the
null array. In fact, in maximizing $\mathcal{L}_{BC}(\beta)$ with respect to
$\beta_{f_0}$, only subject $j_0$'s scores would contribute; adding additional
subjects with $q_{f_0}(\pi_j) = 0$ would not dilute the effect that subject
$j_0$'s scores would have on the estimate of $\beta_{f_0}$, while deleting
subject $j_0$ would make $\beta_{f_0}$ unidentifiable. For this reason, one
might characterize the information upon which $e^{\mathcal{L}_{BC}(\beta)}$ is
conditional as the "magnitudes of the effects," so that the information
that remains is in the "directions of the effects."

Of course, this description lacks precision; nevertheless, the manner
in which a subject possessing especially large Q-scores on a particular
factor can sway the conditional maximum likelihood estimates provides
one motivation for considering other kinds of conditional likelihoods.


II.B.2. The unbalanced Q-set.

The conditional likelihood in II.B.1(3) can be unsatisfactory
for two reasons. First, the likelihood $e^{\mathcal{L}_{BC}(\beta)}$ is only possible in
principle when the Q-set is structured as a completely balanced, cross-factorial
design. Not only are certain Q-sets not completely of this
type, lacking balance for instance, but often even for such carefully
designed Q-sets one desires to append to the item design matrix
certain nuisance covariates, e.g. measures of the items' social desirability,
concept complexity, vocabulary level, etc. (Sundland [1962])
or to consider interactions among the main factors. Thus, a conditional
likelihood applicable to an arbitrary item design matrix is desirable.
Second, as noted in the previous section, for some data sets with
certain Q-matrices, $e^{\mathcal{L}_{BC}(\beta)}$ can behave unsatisfactorily by
overemphasizing the contributions of certain subjects. In these same
instances, numerical problems in the maximization process may result.

In this section we shall develop a conditional likelihood that makes
use of a representation that is dual to that of a permutation.

Definition. Let $\sigma$, a ranking, represent the ordering of a Q-deck
such that

$$\sigma(i) = \text{rank of the } i\text{-th item in the Q-sort}.$$

Note that for a given Q-sort represented by $\sigma$ and $\pi$, $\sigma(\pi(k)) = k$ and
$\pi(\sigma(i)) = i$. So $\sigma$ and $\pi$ are inverses of one another. We shall
denote the range of $\sigma$ by $N(I)$, and refer to it as the ranking space.

Definition. A shuffle, $w: N(I) \to N(I)$, is a one-to-one onto map on
the ranking space $N(I)$ that can be represented by a function
$\pi: \{1, 2, \ldots, I\} \to \{1, 2, \ldots, I\}$ that is one-to-one and onto such that

$$w(\sigma)(i) = \pi(\sigma(i)) .$$

We refer to $\pi$ as the shuffler.

A shuffle operates to rearrange the ordering of a Q-set in a particular,
systematic way. One can visualize a shuffle as an automatic card
shuffling machine. Its argument is the input Q-sort; its result is a
reshuffled deck. More importantly, a shuffle ignores the indices ($i$)
of the cards and operates only on their ranks.

Definition. The composition ($\circ$) of two shuffles, $w_1$ and $w_2$, with
corresponding shufflers $\pi_1$ and $\pi_2$, is such that

$$w_1 \circ w_2(\sigma)(i) = \pi_1(\pi_2(\sigma(i))) = w_1(w_2(\sigma))(i) .$$

By this definition, the sorted deck $w_1 \circ w_2(\sigma)$ results from shuffling
$\sigma$ by $w_2$ and then shuffling that result by $w_1$.

Note that each shuffler $\pi$ induces a semi-group on $N(I)$ whose
corresponding shuffle is the generator. This brings us to our next
construct.

Definition. Let $\Omega$ be a set of shufflers whose corresponding set
of shuffles is closed under composition. A random shuffler is the probability
space whose sample space is $\Omega$ that assigns equal probability to
each element of $\Omega$.

Intuitively, $\Omega$ could be seen as an advancement in card shuffling
technology over the simple shufflers, $\{\omega\}$. Such a machine has at its
disposal several shufflers, and for any particular task it chooses one of
these shufflers at random. More importantly, if one recycles the output
through the random shuffling mechanism once again, the probability distribution
induced on $N(I)$ remains unchanged.
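The definitions above translate directly into operations on tuples. The following is a minimal sketch, assuming ranks and items are labeled from 0 rather than 1; the helper names `invert`, `shuffle`, `compose`, and `is_closed` are hypothetical.

```python
def invert(mapping):
    """Inverse of a one-to-one map on {0, ..., I-1}; turns sigma into pi and back."""
    inv = [0] * len(mapping)
    for position, value in enumerate(mapping):
        inv[value] = position
    return tuple(inv)

def shuffle(shuffler, sigma):
    """Apply a shuffler to a ranking: item i moves from rank sigma[i] to shuffler[sigma[i]]."""
    return tuple(shuffler[rank] for rank in sigma)

def compose(first_applied_last, first_applied):
    """Shuffler of w1 o w2: shuffle by the second argument, then by the first."""
    return tuple(first_applied_last[first_applied[r]] for r in range(len(first_applied)))

def is_closed(shufflers):
    """True when every pairwise composition stays inside the set."""
    members = set(shufflers)
    return all(compose(a, b) in members for a in members for b in members)

sigma = (2, 0, 3, 1)     # item 0 has rank 2, item 1 has rank 0, ...
pi = invert(sigma)       # pi[k] is the item holding rank k

identity = (0, 1, 2, 3)
reverse = (3, 2, 1, 0)
cut = (1, 2, 3, 0)       # moves every rank up by one, cyclically
```

Here `is_closed({identity, reverse})` holds, while `{cut}` alone is not closed because `compose(cut, cut)` is a different shuffler; a random shuffler requires the closed kind of set.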

This set of definitions is now used to simplify the full likelihood.
We propose to condition on a random shuffler. By this scheme, any
$p_j(\pi_j)$ is then of the form

$$p_j(\pi_j) = \frac{\exp\bigl\{\sum_{k=1}^{I} S(k)\, Q_{\pi_j(k)}\, \beta w_j'\bigr\}}{\sum_{\omega \in \Omega} \exp\bigl\{\sum_{k=1}^{I} S(\omega(k))\, Q_{\pi_j(k)}\, \beta w_j'\bigr\}} \tag{II.B.2(1)}$$

The terms $\{\sum_{k=1}^{I} S(\omega(k))\, Q_{\pi_j(k)} : \omega \in \Omega\}$ are then the orbits with respect to
which the conditioning is made. To emphasize the central role $\Omega$ plays
in the creation of these orbits, and to reinforce the sobriety of our
endeavor, $\Omega$, the sample space of the random shuffler, shall be termed the
orbit generator in all of the following.
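A direct evaluation of this conditional probability for one subject can be sketched as follows. The code is illustrative only: ranks and items are labeled from 0, the toy design matrix and parameters are assumptions, and `orbit_probability` is a hypothetical name.

```python
import math

def orbit_probability(perm, S, Q, beta, w, orbit):
    """Conditional probability of the observed sort given an orbit generator.

    perm[k] is the item holding rank k; each shuffler in `orbit` re-scores
    rank k with S[shuffler[k]]. The identity shuffler gives the numerator.
    """
    def predictor(shuffler):
        total = 0.0
        for k, item in enumerate(perm):
            score = S[shuffler[k]]
            for d in range(len(beta)):
                for c in range(len(w)):
                    total += score * Q[item][d] * beta[d][c] * w[c]
        return total

    identity = tuple(range(len(perm)))
    numerator = math.exp(predictor(identity))
    denominator = sum(math.exp(predictor(omega)) for omega in orbit)
    return numerator / denominator

# Assumed toy inputs: I = 4 items, D = 2 design columns, K = 1 covariate.
Q = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
S = [-1.5, -0.5, 0.5, 1.5]
beta = [[0.4], [-0.2]]
w = [1.0]

# Smallest orbit containing the identity and its reverse.
orbit = [(0, 1, 2, 3), (3, 2, 1, 0)]
p = orbit_probability((0, 1, 2, 3), S, Q, beta, w, orbit)
```

The denominator now has as many terms as the orbit generator has elements, rather than $I!$, which is the whole point of the construction.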

We now turn our attention to specific properties desired of $\Omega$.

Definition. An orbit generator $\Omega$ is unbiased with respect to $S$
if

$$\frac{1}{\#\Omega} \sum_{\omega \in \Omega} S(\omega(r)) = 0 \quad \text{for } r = 1, 2, \ldots, I,$$

where $\#\Omega$ denotes the number of elements of $\Omega$. Thus, $\Omega$ is unbiased
if the centroid of the scores $S(\omega(r))$ is the same as the mean of the
scaling function $S(\cdot)$.

Definition. An orbit generator $\Omega$ is said to contain its reversals
if

$$\text{whenever } \omega \in \Omega \text{ then } \bar\omega \in \Omega ,$$

where $\bar\omega(r) = \omega(I-r+1)$. Clearly any orbit that contains its reversals is
unbiased with respect to skew-symmetric scaling functions.
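Both properties are easy to check mechanically for a given set of shufflers. The sketch below assumes 0-based rank labels, so that the reversal of a shuffler is simply its tuple read backwards; the function names and the 4-rank example are assumptions for illustration.

```python
def is_unbiased(orbit, S, tol=1e-12):
    """True when the mean of S(omega(r)) over the orbit is zero for every rank r."""
    orbit = list(orbit)
    return all(
        abs(sum(S[omega[r]] for omega in orbit)) / len(orbit) <= tol
        for r in range(len(S))
    )

def contains_reversals(orbit):
    """True when the reversal of every shuffler in the orbit is also in the orbit."""
    members = set(orbit)
    return all(tuple(reversed(omega)) in members for omega in members)

S = [-1.5, -0.5, 0.5, 1.5]                      # skew-symmetric scaling
identity = (0, 1, 2, 3)
reverse = (3, 2, 1, 0)
cyclic = [tuple((r + k) % 4 for r in range(4)) for k in range(4)]

pair_ok = is_unbiased([identity, reverse], S) and contains_reversals([identity, reverse])
single_ok = is_unbiased([identity], S)
cyclic_unbiased = is_unbiased(cyclic, S)
cyclic_reversals = contains_reversals(cyclic)
```

In this small example the cyclic group is unbiased even though it does not contain its reversals, which suggests that containing reversals is a sufficient rather than a necessary condition.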

The smallest unbiased orbit generator for skew-symmetric $S(\cdot)$ is
$\Omega_0 = \{e, \bar e\}$, where $e$ is the identity shuffler and $\bar e$ is its reverse
($e(r) = r$, $\bar e(r) = I-r+1$). A natural criticism of $\Omega_0$ is that it is too
sparse: Consider the vectors $S_\omega$, such that $S_\omega(r) = S(\omega(r))$. Then of
the $(I-1)$-dimensional space in which resides the $\{S_\omega : \text{all } \omega\}$, $\Omega_0$ spans
only a one-dimensional subspace. With this objection in mind, the following
classes of generators are proposed.

For notational convenience, rather than labeling the ranks by the
numbers $1, \ldots, I$, we label them by the numbers $0, 1, \ldots, I-1$. Also we denote
a shuffler explicitly by an $I$-tuple. Thus, $\omega = (\omega(0), \omega(1), \ldots, \omega(I-1))$, and
$\omega(r)$ connotes the rank to which the shuffler $\omega$ moves the item that bore
rank $r$.

Consider the following table:

          0  1  2  3  4  5  6  7

     0    0  0  0  0  0  0  0  0
     1    0  1  2  3  4  5  6  7
     2    0  2  4  6  0  2  4  6
     3    0  3  6  1  4  7  2  5
     4    0  4  0  4  0  4  0  4
     5    0  5  2  7  4  1  6  3
     6    0  6  4  2  0  6  4  2
     7    0  7  6  5  4  3  2  1

This table is the multiplication table modulo 8. Note also that rows
1, 3, 5, and 7 are permutations (or shufflers), while rows 0, 2, 4, and
6 repeat digits and so are not shufflers. (Not coincidentally, 0, 2, 4,
and 6 have factors common to 8.) For each shuffler from this multiplication
table, that is, for each of

    0 1 2 3 4 5 6 7
    0 3 6 1 4 7 2 5
    0 5 2 7 4 1 6 3
    0 7 6 5 4 3 2 1

consider its cyclics. For 0 1 2 3 4 5 6 7 they are

    0 1 2 3 4 5 6 7
    7 0 1 2 3 4 5 6
    6 7 0 1 2 3 4 5
    5 6 7 0 1 2 3 4
    4 5 6 7 0 1 2 3
    3 4 5 6 7 0 1 2
    2 3 4 5 6 7 0 1
    1 2 3 4 5 6 7 0

Notationally, the cyclics of a shuffler $\omega$ are the compositions
$\omega, c\omega, c^2\omega, \ldots, c^{I-1}\omega$, where $c = (1\ 2\ 3\ \cdots\ I{-}1\ 0)$. Intuitively, the
shuffler $c$ takes the item at the top of the deck and places it on the
bottom.
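The table, its shuffler rows, and their cyclics can be generated for any deck size. The following minimal sketch uses hypothetical helper names and checks the modulo-8 case discussed above.

```python
from math import gcd

def multiplication_table(n):
    """Row p, column r holds (p * r) mod n."""
    return [[(p * r) % n for r in range(n)] for p in range(n)]

def shuffler_rows(n):
    """Rows of the table that are permutations of 0..n-1, keyed by the multiplier p."""
    return {
        p: tuple(row)
        for p, row in enumerate(multiplication_table(n))
        if len(set(row)) == n
    }

def cyclics(shuffler):
    """The compositions c^k omega, k = 0..I-1, where c adds 1 modulo I to every rank."""
    n = len(shuffler)
    return [tuple((x + k) % n for x in shuffler) for k in range(n)]

rows = shuffler_rows(8)
identity_cyclics = cyclics(rows[1])
```

The surviving multipliers are exactly those with `gcd(p, 8) == 1`, in agreement with the remark that rows 0, 2, 4, and 6 share factors with 8.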

The shuffles that are the rows embedded in the multiplication table
modulo $I$ are, in general,

$$\omega_p = (0,\ 1 \cdot p,\ 2 \cdot p,\ \ldots,\ (I-1) \cdot p)$$

where $p$ is relatively prime with respect to $I$ and where the
operation "$\cdot$" is multiplication modulo $I$. Define $\Omega_p$ by

$$\Omega_p = \{\omega : \omega = c^k \omega_p,\ k = 0, 1, \ldots, I-1\} ;$$

thus $\Omega_p$ is the set of all cyclics of $\omega_p$. Finally, define

$$\Omega' = \bigcup\nolimits_p' \, \Omega_p$$

where $\bigcup_p'$ denotes union over all indices $p$ relatively prime to $I$.

Proposition. $\Omega'$ is closed with respect to the composition operation.
Also, $\Sigma' = \{S_\omega : \omega \in \Omega'\}$ spans the $(I-1)$-dimensional linear subspace that
holds the full lattice $\{S_\omega : \text{all } \omega\}$.

Geometrically, $\Omega'$ being closed with respect to composition
translates to the elements of $\Sigma'$ being vertices of some regular
polyhedron in the $(I-1)$-dimensional space of all rankings.
That the elements of $\Sigma'$ span this space implies that this polyhedron
is "solid" in the $(I-1)$-dimensional space. With these two properties,
$\Sigma'$ intuitively satisfies the requirement that the chosen subset of
$\{S_\omega\}$ be evenly scattered over the surface of the $(I-1)$-sphere. We
shall call $\Omega'$ the cyclic prime orbit generator, and call the likelihood
$\Omega'$ induces in II.B.2(1) the cyclic prime likelihood, denoted
$e^{\mathcal{L}_{cp}(\beta)}$.
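Both claims of the proposition can be checked numerically for a small deck. The sketch below is an illustration under stated assumptions, not a proof: it takes $I = 8$, an assumed linear skew-symmetric scaling $S(r) = 2r - 7$, and hypothetical helper names; exact rational arithmetic is used so that the rank computation is not affected by rounding.

```python
from fractions import Fraction
from math import gcd

def cyclic_prime_generator(n):
    """All shufflers r -> (p*r + k) mod n with gcd(p, n) == 1 and k = 0..n-1."""
    return {
        tuple((p * r + k) % n for r in range(n))
        for p in range(1, n) if gcd(p, n) == 1
        for k in range(n)
    }

def compose(a, b):
    """Shuffle by b first, then by a."""
    return tuple(a[b[r]] for r in range(len(b)))

def is_closed(orbit):
    """True when every pairwise composition stays inside the orbit."""
    return all(compose(a, b) in orbit for a in orbit for b in orbit)

def score_vectors(orbit, S):
    """The vectors S_omega with entries S(omega(r))."""
    return [[S[omega[r]] for r in range(len(S))] for omega in orbit]

def rank(rows):
    """Rank of a list of integer rows by exact Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

I = 8
S = [2 * r - (I - 1) for r in range(I)]     # assumed scaling: -7, -5, ..., 7
omega_prime = cyclic_prime_generator(I)
omega_zero = {tuple(range(I)), tuple(reversed(range(I)))}

rank_prime = rank(score_vectors(omega_prime, S))   # spans I - 1 dimensions
rank_zero = rank(score_vectors(omega_zero, S))     # spans one dimension
```

For this example the generator has 32 elements, is closed under composition, and yields score vectors of rank 7, against rank 1 for the two-element generator containing only the identity and its reverse.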

The above proposition has the following generalization. Let $\Pi$ be
a subset of the relative primes of $I$ such that if $p_1$ and $p_2$ are
elements of $\Pi$, so also is $p_1 \cdot p_2$ (multiplication modulo $I$). Unity
is also an element of $\Pi$. Define $\omega_p$ as before by

$$\omega_p = (0,\ 1p,\ 2p,\ \ldots,\ (I-1)p) \quad \text{for all } p \in \Pi ,$$

and define $\Omega_{pq}$, where $q$ is any number which shares a factor with
$I$ ($q$ may equal unity), by

$$\Omega_{pq} = \{\omega : \omega = c^{kq} \omega_p,\ k = 0, 1, 2, \ldots, I-1\} .$$

Then the following generalization is true.

Proposition. $\Omega(\Pi, q) = \bigcup_{p \in \Pi} \Omega_{pq}$ is closed with respect to the composition
operation.

This proposition allows us to consider orbits smaller than those induced by $\Omega'$.

One should note that in simplifying the likelihood to this
conditional form, an element of arbitrariness has been introduced
that was not present previously. Whereas the full likelihood was
invariant to the manner in which ties were broken, as was the
balanced, cross-factorial likelihood, those likelihoods based upon
orbit generators are typically not so invariant. Unfortunately, to
enforce this invariance on this class of likelihoods would add
considerable computational effort, running counter to the primary
motivation for considering this class of estimates.
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III.  Asymptotic  properties  of  likelihood  analyses  in  conditional  problems. 

III. A.  Maximum  likelihood  estimation  and  maximum  conditional  likelihood 
estimation. 


All results of consistency, uniqueness, and asymptotic normality
follow from specialization of results due to Andersen (1970), and are
particularly easy because $p_j(\cdot)$ is an exponential family parametrized by $\beta$.

III.A.1. Consistency and uniqueness.

In chapter II, focus was exclusively upon various likelihood
functions. This focus may seem curious to some, for the likelihood
functions do not furnish us directly with either estimates of $\beta$ or
with inferential procedures. We now close this gap. Estimates of $\beta$
are obtained by using that value of $\beta$ that maximizes whichever likelihood
is convenient. The estimation equations are, for the full likelihood,

$$\nabla_\beta \mathcal{L}(\beta) = 0 = \sum_{j=1}^{J} \bigl[\, q(\pi_j) w_j' - E_j\{q(\pi) w_j' \mid \beta\} \,\bigr]$$

where $E_j\{q(\pi) w_j' \mid \beta\} = \sum_{\pi} p_j(\pi)\, q(\pi) w_j'$, the expectation relative
to the distribution induced by $p_j(\pi)$.

The various conditional likelihoods have estimation equations
of a similar form. For example, the cyclic prime likelihood
has the equations

$$\nabla_\beta \mathcal{L}_{cp}(\beta) = 0 = \sum_{j=1}^{J} \bigl[\, q(\pi_j) w_j' - E_{j,cp}\{q(\pi) w_j' \mid \beta\} \,\bigr] .$$

In each case the matrix of second derivatives is

$$\nabla_\beta \mathcal{L}(\beta) \nabla_\beta' = -\sum_{j=1}^{J} \operatorname{Cov}_j\{q(\pi) w_j',\ w_j q(\pi)'\} ,$$

from which we may infer that the maximum is achieved and, if well-identified,
unique, due to the positive definiteness of $-\nabla_\beta \mathcal{L}(\beta) \nabla_\beta'$.
(Also, if $-\nabla_\beta \mathcal{L}(\beta) \nabla_\beta'$ is only positive semi-definite, then $\beta$ is
not fully identified; constraining $\beta$ in the proper way will identify
it, whence $\mathcal{L}(\beta)$ has a unique constrained maximum.)
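The "observed minus expected" form of the estimation equations, and the negative covariance form of the second derivatives, can be seen in a scalar special case. The sketch below assumes $D = K = 1$, so that each subject contributes an observed score and a finite list of scores over its conditioning orbit; the data and the function names are hypothetical, and Newton's method is used here only as one convenient maximizer.

```python
import math

def conditional_loglik_terms(beta, observed, orbit_stats):
    """Log-likelihood, score, and second derivative for one subject.

    observed is the subject's score; orbit_stats lists the scores of every
    point in the conditioning orbit (the observed score must be among them).
    """
    weights = [math.exp(beta * t) for t in orbit_stats]
    z = sum(weights)
    mean = sum(wt * t for wt, t in zip(weights, orbit_stats)) / z
    var = sum(wt * (t - mean) ** 2 for wt, t in zip(weights, orbit_stats)) / z
    return beta * observed - math.log(z), observed - mean, -var

def newton_maximize(subjects, beta=0.0, steps=25):
    """Solve the estimation equation sum_j (observed_j - E_j) = 0 by Newton steps."""
    for _ in range(steps):
        score = sum(conditional_loglik_terms(beta, o, s)[1] for o, s in subjects)
        hess = sum(conditional_loglik_terms(beta, o, s)[2] for o, s in subjects)
        if hess == 0.0:
            break  # beta is not identified by these data
        beta -= score / hess
    return beta

# Assumed data: four subjects sharing the same orbit of scalar scores.
orbit_stats = [2.0, 1.0, -1.0, -2.0]
subjects = [(1.0, orbit_stats), (-1.0, orbit_stats),
            (2.0, orbit_stats), (1.0, orbit_stats)]

beta_hat = newton_maximize(subjects)
final_score = sum(conditional_loglik_terms(beta_hat, o, s)[1] for o, s in subjects)
```

Because the second derivative is minus a variance, the log-likelihood is concave in this scalar case, which is the one-dimensional counterpart of the uniqueness argument given above.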

For reasons related directly to the uniqueness of the solution
to the maximum (conditional) likelihood equations, it follows that as
$J$ becomes large, the solution, $\hat\beta$, converges to the true value $\beta$.

III.A.2. Asymptotic normality.

Let us denote the Fisher information matrix with respect to
the likelihood $\mathcal{L}(\cdot)$ by $I_{\mathcal{L}}(\beta)$ and define it by

$$J\, I_{\mathcal{L}}(\beta) = \sum_{j=1}^{J} \operatorname{Cov}_{\mathcal{L}}\{q(\pi) w_j',\ w_j q(\pi)'\} ,$$

where $\mathcal{L}$ is whichever (conditional) likelihood is being employed and
where $\operatorname{Cov}_{\mathcal{L}}\{\cdot,\cdot\}$ is the covariance with respect to $\mathcal{L}$.
Then from Andersen (1970), it follows that $J^{1/2}(\hat\beta - \beta)$ is distributed
$N(0, I_{\mathcal{L}}^{-1}(\beta))$ as $J$ becomes large; simple inferences may be made on this
basis.
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III.B.  Generalized  likelihood  ratio  hypothesis  testing. 

While the property of $\hat\beta$ being asymptotically normal may be
exploited directly to form tests of hypotheses, the generalized likelihood
ratio statistic (glr) is usually more convenient. This is because the
glr is available as a direct consequence of the maximizing algorithm. The
general form of this statistic is as follows.

Consider the hypotheses

$$H_0: \beta \in \mathcal{H}_0 \quad \text{vs.} \quad H_1: \beta \in \mathcal{H}_1 .$$

The conditions which we adopt are that (a) $\mathcal{H}_0$ is a subset of $\mathcal{H}_1$ and
(b) $\mathcal{H}_0$ contains no subset that is an open set in $\mathcal{H}_1$. Under these
conditions, and for our model, the statistic

$$\mathrm{glr}(01) = \frac{\max_{\beta \in \mathcal{H}_0} \exp\{\mathcal{L}(\beta)\}}{\max_{\beta \in \mathcal{H}_1} \exp\{\mathcal{L}(\beta)\}}$$

has nice properties. In particular, the asymptotic distribution may be
derived by using the fact referenced in the previous section that $\hat\beta$ is
asymptotically normally distributed. The asymptotic distribution of glr(01) is such that
$-2 \log \mathrm{glr}(01)$ is approximately $\chi^2$ with $d_1 - d_0$ degrees of freedom,
where $d_i$ is the dimensionality of hypothesis $\mathcal{H}_i$. We shall use the above
result for the solution of the hypothesis testing problems posed in the
next chapter.

IV. Problems of applied interest.

This  chapter  considers  two  kinds  of  applied  problems  that  naturally 
arise  in  the  context  of  the  structured  Q-sort.  The  first  of  these,  the 
testing of nested hypotheses to assess the significance of sets of
$\beta$-coefficients, has a particularly simple form. The simplicity of this
theory  results  not  from  any  special  properties  of  the  Q-sort  model  but 
from  the  well-known  theory  of  (conditional)  generalized  likelihood-ratio 
tests.  An  example  is  considered  in  detail  to  establish  the  appropriateness 
of  this  theory. 

In  the  latter  part  of  this  chapter,  a  class  of  hypotheses  is  described 
that  falls  outside  the  natural  domain  of  the  generalized  likelihood  ratio 
tests.  Precisely  because  these  hypotheses  are  central  to  Stephenson's 
structured  Q-sort  methodology,  special  attention  is  required  to  develop 
an  appropriate  test. 

IV. A.  Testing  nested  hypotheses. 

Before proceeding, the following change in notation is convenient.
In chapter III, $\mathcal{H}_i$, $i = 0, 1, \ldots$, denoted subspaces of the parameter
space to which $\beta$ belonged. We replace this convention with another.
Hereafter, let $\mathcal{H}_i$ denote sets of index pairs. Thus, a hypothesis
can be of the form

$$H_i: \beta_{dk} = 0 \quad \text{for all } dk \in \mathcal{H}_i .$$

Although these classes of hypotheses are less general than those of
chapter III, they are sufficient for most practical problems.

The general problem upon which we focus is the test of the hypothesis

$$H_0: \beta_{dk} = 0 \quad \text{for } dk \in \mathcal{H}_0$$

versus

$$H_1: \beta_{dk} = 0 \quad \text{for } dk \in \mathcal{H}_1 ,$$

where the key regularity condition is that $\mathcal{H}_1$ is a proper subset of
$\mathcal{H}_0$. If we let $g = \#\mathcal{H}_0 - \#\mathcal{H}_1$, this condition is sufficient to show that
$-2 \log \lambda_{01}$ is asymptotically distributed as a chi-square with $g$ degrees
of freedom, where $\lambda_{01}$ is defined by

$$\lambda_{01} = \frac{\sup\{\exp\{\mathcal{L}(\beta)\} : \beta_{dk} = 0,\ dk \in \mathcal{H}_0\}}{\sup\{\exp\{\mathcal{L}(\beta)\} : \beta_{dk} = 0,\ dk \in \mathcal{H}_1\}} .$$
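In practice the test reduces to comparing two maximized log-likelihoods. The following is a minimal sketch using only the Python standard library; the maximized log-likelihood values are invented for illustration, and `chi2_sf` is a hypothetical helper that evaluates the chi-square upper tail for integer degrees of freedom through the standard recurrence for the regularized incomplete gamma function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Starting values: df = 2 gives exp(-x/2); df = 1 gives erfc(sqrt(x/2)).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q

def glr_test(loglik_null, loglik_alt, g):
    """Return -2 log(lambda_01) and its asymptotic chi-square p-value on g df."""
    statistic = -2.0 * (loglik_null - loglik_alt)
    return statistic, chi2_sf(statistic, g)

# Assumed maximized log-likelihoods; g = 3 coefficients freed under H1.
statistic, p_value = glr_test(loglik_null=-105.2, loglik_alt=-101.0, g=3)
```

With these invented values the statistic is 8.4 on 3 degrees of freedom, which exceeds the conventional 5 percent critical value of about 7.81.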

One  example  for  which  such  a  hypothesis  might  be  formulated  is  the 
following: 

Very  often  in  the  development  of  a  Q-set,  the  matrix  Q  is  chosen  and 
fixed;  only  then  are  the  Q-items  formulated.  The  most  notable  instances 
of  this  procedure  are  the  balanced,  cross-factorial  designs  Stephenson 
built  into  his  Q-sets.  A  primary  criticism  of  this  procedure,  articulated 
by  Sundland  (1962)  ,  is  that  no  guarantee  can  be  made  that  the  items  are 
actually  sorted  in  response  to  those  properties  that  led  to  their  choice 
in  the  first  place.  In  particular,  the  rater  may  be  reacting  primarily 
to  the  social  desirability  .>f  the  items,  or  their  conceptual  complexity, 
or  an  Interaction  between  such  features.  Because  of  these  concerns,  it 
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may  be  desirable  to  augment  the  item  design  matrix  with  additional  columns 
that  represent  such  "nuisance"  covariates.  Once  incorporated,  the  coeffi¬ 
cients  of  these  covariates  can  then  be  evaluated  to  determine  if  they  are 
significantly  associated  with  the  propensity  of  any  item  to  being  ranked 
highly.  The  form  of  the  null  hypothesis  is 

$$H_0:\ \beta_{dk} = 0, \quad d = D_1, \ldots, D_2 \ \text{and all}\ k,$$

where the rows $D_1, \ldots, D_2$ would represent such "nuisance" covariates; the
natural alternative hypothesis is one where $\Omega_1$ is an empty index set.

The  above  null  hypothesis  may  lack  power  by  being  too  global.  While  the 
raters  may  be  responding  to  an  unintended  concomitant  feature  of  the  items, 
they  may  be  less  likely,  if  well-trained,  to  confound  the  error  by  reacting 
to  this  feature  in  different  degrees  with  different  subjects.  Motivated 
by this consideration, a hypothesis intermediate between $H_0$ and the
general alternative, of the form

$$H':\ \beta_{dk} = 0, \quad d = D_1, \ldots, D_2 \ \text{and}\ k \neq 1,$$

(where $w_1 \equiv 1$, the constant part of the predictor space), could
represent an a priori direction that would successfully concentrate the
power of the test of $H_0$. In this scheme, the sequence of hypothesis tests
$H_0$ vs. $H'$ and then $H'$ vs. $H_1$ provides a stepwise procedure with the
potential of greater acuity than that which would result from simply testing
$H_0$ vs. $H_1$ globally. (An incidental but convenient property of this form
of the stepwise procedure is that the two tests of which it is composed are
independent.)
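
The stepwise procedure rests on a standard factorization of nested likelihood ratios. The following is a sketch only, assuming the three hypotheses are nested from most to least restrictive as $H_0$, $H'$, $H_1$; the symbols $\hat{\mathcal{L}}_0$, $\hat{\mathcal{L}}'$, and $\hat{\mathcal{L}}_1$ (the maximized log-likelihoods under each hypothesis) are notation introduced here for illustration and do not appear in the original text.

```latex
-2\log\lambda(H_0 \text{ vs. } H_1)
  = 2\bigl(\hat{\mathcal{L}}_1 - \hat{\mathcal{L}}_0\bigr)
  = \underbrace{2\bigl(\hat{\mathcal{L}}' - \hat{\mathcal{L}}_0\bigr)}_{H_0 \text{ vs. } H'}
  + \underbrace{2\bigl(\hat{\mathcal{L}}_1 - \hat{\mathcal{L}}'\bigr)}_{H' \text{ vs. } H_1}
```

Under this assumption the global statistic is the sum of the two stepwise statistics, and the degrees of freedom (the number of constraints released at each step) add in the same way.
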

IV. B. Testing validity by using a design that stratifies subjects.

In the previous section, the standard theory of generalized likelihood
ratio testing was sketched. Recall, however, that the example given there
concerned a relatively peripheral issue: the significance of nuisance
covariates. The choice of this example is not accidental. The central
hypotheses of Q-studies are usually not of the general form of the previous
section; rather, the sign of the coefficients is usually specified in the
alternative. This is because the samples of structured Q-sort studies are
often configured by deliberately choosing certain kinds of subjects.
Stephenson proposed selecting individuals with characteristics that could
be theoretically predicted. If their Q-sorts did not correspond well with
the predictions of the theory, the theory was considered invalidated.

As  an  example  of  this,  consider  a  Q-set  representing  the  typology 
of Spranger (1928) that partitioned persons into the types: religious,
political, theoretical, economic, aesthetic, and social. The idea is
then  to  test  this  Q-set  upon  clerics,  whose  value  systems  one  would  expect 
to  be  religious,  politicians,  whose  value  systems  one  would  expect  to  be 
political,  academics,  bankers,  artists  and  bartenders,  each  subject  to  a 
natural  a  priori  classification.  If  the  Q-sort  shows  clerics  not  being 
particularly  religious,  politicians  not  particularly  political,  and  so  on, 
then  the  most  natural  conclusion  is  that  the  Q-set,  the  instrument,  is  no 
good,  and  that  by  inference,  the  theory  the  Q-set  represents  is  invalid. 

Formally, this problem may be presented by hypotheses of the form

$$H_0:\ \beta_{dk} > 0,\ dk \in \Omega \quad \text{vs.} \quad H_1:\ \beta_{dk}\ \text{arbitrary},\ dk \in \Omega,$$

where $\Omega$ is a designated index set. The generalized likelihood ratio
test is not appropriate as it stands because the dimensionality of the
null and alternative hypotheses is the same.

The following modification makes generalized likelihood ratio
hypothesis testing feasible: Under both $H_0$ and $H_1$ restrict
$\beta_{dk} = \theta$, $dk \in \Omega$, with $\theta$ unknown. Then with

$$H_0':\ \beta_{dk} = \theta,\ dk \in \Omega,\ \theta > 0 \quad \text{vs.} \quad H_1':\ \beta_{dk} = \theta,\ dk \in \Omega,\ \theta < 0,$$

we obtain a structure where an assessment of this one-sided hypothesis is
easily made.

The reparametrizing of $H_0$ and $H_1$ says that the degree to
which a cleric is religious is the same as the degree to which a
politician is political and the degree to which a banker is economic.
This seems not unreasonable. One may wish to test this hypothesis,
however, and the following test is independent of the latter. Let
$H_0''$ and $H_1''$ be

$$H_0'':\ \beta_{dk} = \theta,\ dk \in \Omega,\ -\infty < \theta < \infty \quad \text{vs.} \quad H_1'':\ \beta_{dk}\ \text{arbitrary}.$$

In practice one might wish to test $H_0''$ and, if accepted, test $H_0'$.
Rejection in either case can be construed as evidence against validity.
The interpretation of the rejection differs between the two cases,
however. $H_0''$ postulates the lack of interactions that might otherwise
confound the test of the main effect; its rejection would imply the
presence of such interactions. $H_0'$ postulates the direction of the
main effect; its rejection invalidates the theory that was the basis
of the construction of the Q-set.

V. Analysis of the unstructured Q-sort.

In chapters II, III, and IV, the item design matrix Q and the
subject design matrix W were always assumed known. This was the case
of the structured Q-sort. The Q-set had an a priori structure, as did
the subjects; the problem was the manner in which these two structures
related. The estimate of the matrix $\beta$ represented a solution to this
problem. In this chapter at least one of the design matrices, Q or
W, is unknown and reasonable estimates of them are desired. The
form of $\beta$, on the other hand, is no longer of interest; indeed, because
it is unidentifiable, the issue of its form is moot.

V.A.  The  statement  of  the  problem  and  its  algorithm. 

Chapter  II  reparametrized  p(i,j)  to  be  of  the  form 
where the $q_{id}$ and $w_{jd}$ are unknown scalars. This formulation is very
similar to that of the factor analysis problem: the $\{q_{id}\}$ are the
items' factor loadings and the $\{w_{jd}\}$ are the subjects' factor scores.
The absence of the matrix $\beta$ can be attributed, by this analogy, to
the indeterminate linear transformation that may be shunted between the
factor loadings and the factor scores.

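The $\{p(i,j)\}$ in V.A.(1) are not fully identified. If we define

$$p(i,j) = \exp\{Q_i \Lambda W_j'\}$$

with the restrictions that

$$\sum_i Q_{id} = 0 \ \text{for all } d = 1, \ldots, D, \qquad \tfrac{1}{I} Q'Q = I_D \quad \text{and} \quad \tfrac{1}{J} W'W = I_K, \qquad \text{V.A.(2)}$$

and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_D)$, and require
$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_D \geq 0$, then
the parameters $Q_i$, $\Lambda$, and $W_j$ are essentially identified. (The
remaining ambiguity takes place only when some of the $\lambda_d$'s are
equal.)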

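The algorithm seeks to maximize the objective function

$$\mathcal{L}_{cp}(Q \Lambda W'), \quad \text{subject to } \lambda_1 \geq \cdots \geq \lambda_D \text{ and V.A.(2)}, \qquad \text{V.A.(3)}$$

where $\mathcal{L}_{cp}(Q \Lambda W')$ is such that

$$e^{\mathcal{L}_{cp}(Q \Lambda W')} = \prod_{j=1}^{J} \frac{\exp\{[\sum_k S(k)\, Q_{j(k)}] \Lambda W_j'\}}{\sum_{\omega \in \Omega'} \exp\{[\sum_k S(\omega(k))\, Q_{j(k)}] \Lambda W_j'\}}.$$
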

The maximization of $\mathcal{L}_{cp}$ over such a high dimensional space (the
dimension is $D(I+J-1)$) is impractical for a moderate number of subjects.
When $I = 100$, a typical number, $\Omega'$ has four thousand orbits. For
example, if $J = 100$, in order to evaluate $\mathcal{L}_{cp}$ even once, four
hundred thousand orbits would need to be evaluated. This is not generally
practical.

One natural modification is to consider the smaller orbits $\Omega(\Pi, q)$
that were described in II.B.2. If it is desired that $\Omega(\Pi, q)$ span the
$(I-1)$-dimensional space of rankings, then $\Omega(\Pi, q)$ needs to have at
least $I$ elements. It may be desired that $\Omega(\Pi, q)$ contain its own
reversals; then $\Omega(\Pi, q)$ must contain at least $2I$ elements.


V.B.  Variations  to  the  analysis  of  the  unstructured  Q-sort. 

Two  variations  of  the  above  analysis  may  be  posed.  (1)  Rather 
than  having  both  design  matrices  unknown,  only  one  design  matrix  may 
be  unknown;  the  other  is  specified.  (2)  Having  estimated  one  or  both 
of  the  matrices  Q  and  W,  rotations  to  simple  structure  are  often 
desirable  in  order  to  ease  the  interpretation  of  the  factor  structure. 

V.B.1. The factor analysis given one specified design matrix.

Because  the  problem  of  estimating  the  subject  design  matrix 
when  the  item  design  matrix  is  known  is  closely  parallel  to  that  of 
estimating  the  item  design  matrix  when  the  subject  design  matrix  is 
specified,  only  the  latter  will  be  discussed. 

Since $\frac{1}{J} W'W$ need not be the identity matrix, consider the
transformation $T$ so that $\frac{1}{J}(TW)'(TW) = I$. Let $X = TW$.

The objective function is then

$$\mathcal{L}_{cp}(Q \Lambda X') \quad \text{subject to } \tfrac{1}{I} Q'Q = I;\ \sum_i Q_{id} = 0 \text{ for all } d;\ \text{and } \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_D \geq 0, \qquad \text{V.B.1(1)}$$

and we maximize V.B.1(1) with respect to $Q$ and $\Lambda$ and use the
maximizing values as the magnitudes and directions of the factor
structure.  8  *  AT*  ^  may  be  interpreted  as  the  parameter  matrix 
relating  the  factor  loadings  Q  to  the  "factor  scores"  W. 

V.B.2. Rotations to simple structure.

The restrictions V.A.(2) are only necessary for technical reasons:
to ensure that Q, $\Lambda$, and W are identifiable. These estimates need not
be easy to interpret, however; one may wish to exploit their lack of
definition by selecting orthogonal rotations of Q and W to make their
structure more comprehensible. In the mid-fifties several criteria for simple
structure that furnish precise algorithms were proposed (Ferguson [1954],
Carroll [1953], Neuhaus and Wrigley [1954], Saunders [1953], and Kaiser
[1956]).

Each of these rotations to simple structure operates on the factor
loadings. As a result, the factor loading matrix Q (and its dual,
the factor score matrix W) is in the correct form to be rotated by
VARIMAX, QUARTIMAX, or whatever; that the factor loading matrix is the
result of maximizing a cyclic prime likelihood is not relevant.

However, the matrix $\Lambda$ needs to be transformed if either Q or
W is rotated. If $Q(R) = QR$ is the result of the rotation $R$, and
if $W(S) = WS$ is the result of the rotation $S$, then $\Lambda$ needs to be
replaced by $\beta = R' \Lambda S$; thus

$$\log p(i,j) = (Q \Lambda W')_{ij} = \bigl(Q(R)\,(R' \Lambda S)\,W(S)'\bigr)_{ij}.$$

Therefore, the simplicity of any rotations of Q or W needs to be weighed
against the complexity such transformations may induce on the matrix $\beta$.
Incorporating such an index of the simplicity of $\beta$ into the algorithm
optimizing the simplicity of Q and W seems appropriate.
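
The invariance just described can be checked numerically. The sketch below is illustrative only: the small matrices `Q`, `W`, and `Lam`, the rotation angles, and the helper functions are hypothetical values chosen for the demonstration, not quantities from the analysis. It confirms that rotating Q by R and W by S leaves log p(i,j) unchanged provided the middle matrix is replaced by R' Lambda S, and that the replacement is in general no longer diagonal.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def rotation(theta):
    """2 x 2 orthogonal rotation matrix for angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Hypothetical toy values: 3 items, 2 subjects, D = 2 factors.
Q = [[1.0, 0.5], [-0.5, 1.0], [-0.5, -1.5]]
W = [[0.8, -0.6], [0.6, 0.8]]
Lam = [[2.0, 0.0], [0.0, 0.5]]

R = rotation(0.7)    # rotation applied to Q
S = rotation(-1.2)   # rotation applied to W

# Matrix of log p(i, j) before any rotation: Q Lambda W'.
log_p = matmul(matmul(Q, Lam), transpose(W))

Q_R = matmul(Q, R)                          # Q(R) = Q R
W_S = matmul(W, S)                          # W(S) = W S
B = matmul(matmul(transpose(R), Lam), S)    # R' Lambda S replaces Lambda
log_p_rotated = matmul(matmul(Q_R, B), transpose(W_S))

# Largest entrywise discrepancy between the two parametrizations.
max_diff = max(
    abs(x - y) for r1, r2 in zip(log_p, log_p_rotated) for x, y in zip(r1, r2)
)
# Size of the off-diagonal part of the transformed middle matrix.
off_diagonal = abs(B[0][1]) + abs(B[1][0])
print(max_diff, off_diagonal)
```

The nonzero off-diagonal entries of `B` illustrate the trade-off noted above: a simpler Q or W can come at the price of a more complex middle matrix.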

VI. An example.


The  primary  consideration  in  developing  a  class  of  likelihood  models 
conditional  on  random  shufflers  is  computational  feasibility.  Even  so, 
maximizing  conditional  likelihood  functions  remains  CPU  intensive;  just 
how  intensive  is  best  illustrated  by  a  practical  example. 

VI. A.  Description  of  origin  of  the  data. 

The  data  was  provided  by  Phyllis  Sherlock,  PhD;  the  reader  is 
referred  to  Sherlock  (1980)  for  the  substantive  details  of  the  origin 
of  the  data.  For  the  present  purpose  of  providing  a  practical  example 
of  the  analysis,  the  following  summary  of  Sherlock’s  design  of  the 
Q-set  is  provided. 

(1)  The  theoretical  background  for  the  structure  of  the 
Q-set  is  the  typology  of  female  psychologies  of  Toni  Wolff, 
who  developed  it  in  the  framework  of  the  analytical  psychology 
of  Carl  Jung.  Four  types  of  psychologies  are  postulated: 

The  Mother,  the  Amazon,  the  Hetaira,  and  the  Medium.  These 
four types are arranged as two bipolar pairs, the "Mother-Hetaira"
and the "Amazon-Medium"; these two bipolarities
compose a coordinate system of two orthogonal axes. See
Figure 1.


[Figure 1. The two orthogonal bipolar axes, with poles labeled Mother, Hetaira, Amazon, and Medium.]

The point at which these axes intersect is the origin in
this coordinate space; the bipolarities are the coordinate
directions. In this theoretical model, the psychology of
any particular female corresponds to a point in this
two-dimensional space.

(2) Based on Wolff’s descriptions of each of these four
types, Sherlock developed a set of Q-items, each item
consisting of simple adjectives or short phrases. With each
type, Sherlock associated twenty-five Q-items; the Q-set
was composed of these four groups of twenty-five items, one
hundred items in all.

The  item  design  matrix,  Q,  was  generated  in  the  following  way: 

(3)  Sherlock  had  four  experts  in  Wolff’s  typology  each  sort 
the  Q-set  four  times,  one  sort  for  each  of  the  female  types. 

The  experts  were  told  to  sort  for  the  ideal  Mother,  the  ideal 
Amazon,  and  so  on. 

(4)  Based  on  this  expert  data,  the  item  design  matrix  was 
constructed  as  follows: 

(a)  Within  each  expert,  the  four  scores  of  each  item 
were  centered  to  have  mean  zero  across  the  four  conditions 
(the conditions of sorting for the ideal Mother, the ideal
Amazon, ...).

(b)  For  each  condition,  the  scores  were  averaged  across 
raters,  giving  a  design  matrix  of  rank  four. 

(c) Each column of this matrix was then "centered", that
is, made orthogonal to the 1 x 100 vector, each of whose
elements is unity. The resulting design matrix had four
columns but had rank three.

(d) This design matrix was in turn transformed to reflect
the theoretical coordinate structure. By subtracting the
design column corresponding to the Hetaira type from the
column corresponding to the Mother type, a design column
representing the ordinate (Mother-Hetaira) was produced.
By subtracting the column corresponding to the Medium type
from that corresponding to the Amazon type, a design column
representing the abscissa was formed. (The design could have
been saturated by including the column consisting of Mother +
Hetaira - Amazon - Medium, but for simplicity this was
not done.)
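
Steps (a) through (d) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Sherlock's computation: the expert sorts are replaced by synthetic random sorts, and the nine-category forced distribution `FORCED` is hypothetical (the distribution actually used is not given here). All function and variable names are introduced for the example.

```python
import random

random.seed(0)

TYPES = ["Mother", "Amazon", "Hetaira", "Medium"]
N_ITEMS, N_EXPERTS = 100, 4
# Hypothetical forced distribution over nine categories (sums to 100).
FORCED = [5, 8, 12, 16, 18, 16, 12, 8, 5]

def random_sort():
    """One synthetic Q-sort: category scores 1..9 assigned to the 100 items."""
    scores = [cat + 1 for cat, n in enumerate(FORCED) for _ in range(n)]
    random.shuffle(scores)
    return scores

# sorts[expert][type] is a list of 100 item scores (synthetic stand-in data).
sorts = [{t: random_sort() for t in TYPES} for _ in range(N_EXPERTS)]

# (a) Within each expert, center each item's four scores across conditions.
centered = []
for expert in sorts:
    item_means = [sum(expert[t][i] for t in TYPES) / len(TYPES) for i in range(N_ITEMS)]
    centered.append(
        {t: [expert[t][i] - item_means[i] for i in range(N_ITEMS)] for t in TYPES}
    )

# (b) For each condition, average the centered scores across experts.
design = {
    t: [sum(c[t][i] for c in centered) / N_EXPERTS for i in range(N_ITEMS)]
    for t in TYPES
}

# (c) Center each column so that it is orthogonal to the vector of ones.
for t in TYPES:
    column_mean = sum(design[t]) / N_ITEMS
    design[t] = [x - column_mean for x in design[t]]

# (d) Form the two theoretical coordinates as differences of columns.
ordinate = [m - h for m, h in zip(design["Mother"], design["Hetaira"])]  # Mother - Hetaira
abscissa = [a - m for a, m in zip(design["Amazon"], design["Medium"])]   # Amazon - Medium

Q = list(zip(ordinate, abscissa))  # 100 x 2 item design matrix
print(len(Q), round(sum(ordinate), 10), round(sum(abscissa), 10))
```

Because every column is centered in step (c), both difference columns formed in step (d) also sum to zero over the items.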

The  subject  design  matrix,  W,  was  chosen  as  follows: 

(5) A constant covariate was included to reflect the overall
propensity for the subjects sampled to be of any particular type.

(6) Two scores from the Myers-Briggs inventory were included
to reflect some of Sherlock’s hypotheses. The Myers-Briggs
is a paper-and-pencil questionnaire designed to measure
four traits central to Jung’s personality theory. The traits
are: extraversion-introversion (E-I), thinking-feeling (T-F),
sensation-intuition (S-N), and judging-perceiving (J-P). The
scores E-I and J-P, with the intercept, made up the subject
design matrix.

(7) Sherlock collected Q-sorts from 80 individuals. Of these,
three were rejected from the computer runs because of coding
aberrations. Therefore, the analysis below is based on 77 Q-sorts.


VI. B.  Description  of  the  technical  characteristics  of  the  algorithm. 


The  analysis  presented  below  was  implemented  on  the  IBM  3033 
computer  located  at  Stanford’s  Center  for  Information  Technology.  The 
source code was written in FORTRAN and compiled at level G. Several
IMSL routines were employed to perform some of the standard
transformations of matrices required. The algorithm is a "protected"
Newton-Raphson iterative scheme, modified to ensure that each refinement
brings an increase in the likelihood.
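
One common way to "protect" a Newton-Raphson iteration is step halving: the full Newton step is tried first and is halved until the objective increases. Whether the FORTRAN program used exactly this device is not stated, so the sketch below is an assumption offered for illustration; the one-parameter Poisson log-likelihood and all names in it are hypothetical stand-ins, not the Q-sort likelihood.

```python
import math

def protected_newton(f, grad, hess, x0, tol=1e-10, max_iter=50):
    """Maximize a one-parameter function f by Newton-Raphson with step halving.

    A full Newton step is tried first; while it fails to increase f, the step
    is halved. Returns the final point and the history of accepted f values.
    """
    x, history = x0, [f(x0)]
    for _ in range(max_iter):
        g, h = grad(x), hess(x)
        if abs(g) < tol:
            break
        # Fall back to a plain gradient step if the curvature is not negative.
        step = -g / h if h < 0 else g
        while f(x + step) <= f(x) and abs(step) > 1e-16:
            step /= 2.0
        if f(x + step) <= f(x):
            break  # no increase is attainable at working precision
        x += step
        history.append(f(x))
    return x, history

# Toy concave log-likelihood (hypothetical): Poisson counts with log-mean b.
counts = [3, 1, 4, 1, 5]

def loglik(b):
    return sum(y * b - math.exp(b) for y in counts)

def score(b):
    return sum(y - math.exp(b) for y in counts)

def curvature(b):
    return -len(counts) * math.exp(b)

# A poor starting value makes the first full Newton step overshoot badly.
b_hat, history = protected_newton(loglik, score, curvature, x0=-3.0)
print(b_hat, len(history))
```

From the deliberately poor starting value the unprotected step would lower the likelihood; with halving, every accepted step increases it, and the iteration settles at the logarithm of the sample mean.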

For the data set described in section VI.A, each iteration,
consisting of an evaluation of the likelihood, its gradient, and its matrix
of partials, takes approximately 0.20 CPU minutes. Eleven iterations were
required to locate the solution reported below. The default allocation
of 256 Kilobytes of core memory was adequate.

The orbit generator employed consisted of the 200 shufflers whose
three generators are the following:

$$\omega_{13} = (0, 13, 26, \ldots, 87),$$

that is, $\omega_{13}(r) = 13 \cdot r \pmod{100}$,

$$e = (99, 98, 97, \ldots, 1, 0),$$

and

$$c^{20} = (20, 21, 22, \ldots, 99, 0, 1, 2, \ldots, 19).$$

The group generated by $\omega_{13}$ has twenty elements, that generated by $e$
two elements, and that generated by $c^{20}$ five, for a total of
$20 \times 2 \times 5 = 200$ elements.
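
The counts quoted above can be verified by direct enumeration. The sketch below is illustrative and makes one assumption that the text does not settle: the 200 shufflers are taken to be the products of powers of the three generators composed in one fixed order. It counts those products only and does not test closure under composition. All function names are introduced for the example.

```python
N = 100

def omega13(r):
    return (13 * r) % N        # multiplication by 13 modulo 100

def reversal(r):
    return N - 1 - r           # e: (99, 98, ..., 1, 0)

def shift20(r):
    return (r + 20) % N        # c^20: cyclic shift by twenty positions

def power(f, n):
    """Return the permutation f applied n times, as a tuple of images."""
    perm = tuple(range(N))
    for _ in range(n):
        perm = tuple(f(r) for r in perm)
    return perm

def order(f):
    """Smallest n >= 1 for which f applied n times is the identity."""
    identity, n = tuple(range(N)), 1
    while power(f, n) != identity:
        n += 1
    return n

def compose(p, q):
    """Permutation r -> p[q[r]] (apply q first, then p)."""
    return tuple(p[q[r]] for r in range(N))

orders = (order(omega13), order(reversal), order(shift20))

# All products omega13^a . e^b . (c^20)^t, taken in this fixed order.
shufflers = {
    compose(power(omega13, a), compose(power(reversal, b), power(shift20, t)))
    for a in range(orders[0])
    for b in range(orders[1])
    for t in range(orders[2])
}
print(orders, len(shufflers))
```

Collecting the products in a set removes any duplicates, so the size of `shufflers` is the number of distinct permutations obtained.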

VI. C. Description of the results.


In maximizing the likelihood, its logarithm went from -407.97 at
$\beta = 0$ to -407.56 at its maximum, a change corresponding to a chi-square
with six degrees of freedom of 0.82. From this one may conclude that
the model fitted did not significantly differ from the $\beta = 0$ model. One
should add, however, that Sherlock had no strong hypotheses about the
relation of Wolff's typology to either of the two scores from the
Myers-Briggs; the lack of any significant effects has, therefore, no
particular impact on the validity of Wolff’s typology.
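
The arithmetic behind this conclusion can be reproduced directly. The sketch below takes the two log-likelihood values and the six degrees of freedom from the text; the helper `chi2_sf_even_df` is a name introduced here, and it relies on the standard closed form for the upper tail of a chi-square distribution with an even number of degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate with an even number of df.

    Uses the closed form exp(-x/2) * sum over i < df/2 of (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

loglik_null = -407.97   # log-likelihood at beta = 0 (from the text)
loglik_max = -407.56    # maximized log-likelihood (from the text)
df = 6                  # degrees of freedom (from the text)

# Likelihood ratio statistic: twice the gain in log-likelihood.
statistic = 2.0 * (loglik_max - loglik_null)
p_value = chi2_sf_even_df(statistic, df)
print(round(statistic, 2), round(p_value, 3))
```

The statistic reproduces the reported 0.82, and the corresponding tail probability is about 0.99, far from any conventional significance level.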

The  format  of  the  answers  that  this  likelihood  model  produces  should 
be  of  interest  to  any  who  seek  to  build  predictive  models  of  structured 
Q-sort  data.  Essentially  this  format  has  three  features: 

(1)  Coefficients  are  fit  in  a  manner  that  allows  them  to 
be  interpreted  as  regression  coefficients;  they  may  be 
standardized.  See  Table  1. 

(2) The coefficients allow a pair of dual visualizations: The
underlying dimensions of the item design space may be represented
as coordinates in the subject design space. Similarly the
underlying dimensions of the subject design space may be represented
as coordinates in the item design space. See Figures 2 and 3.
This is the duality described in section II.A.1.

(3)  The  variances  and  correlations  of  the  coefficients  are  obtained 
from  the  Fisher  information  matrix.  See  Table  2. 

Figure 2. Location of the covariates of the subject design
matrix in the item design space.

[Figure not reproduced; surviving labels: Judging, Perceiving, Introversion, Extraversion, Amazon, Mother; axis ticks at -5 x 10^-3 and 5 x 10^-3.]

Figure 3. Location of the features of the item design matrix in
the subject design space.

Table 2. The Fisher Information Matrix of the Coefficients.

                             (Mother-Hetaira)                  (Amazon-Medium)
                     x(Intercept)  x(E-I)   x(J-P)     x(Intercept)  x(E-I)   x(J-P)

(Mother-Hetaira)
   x Intercept           .529
   x E-I                 .00454    .0314
   x J-P                -.00432    .0196    .0306

(Amazon-Medium)
   x Intercept           .448      .00349  -.00303        .528
   x E-I                 .00293    .0312    .0196         .00507     .0314
   x J-P                -.00368    .0195    .0304        -.00364     .0195    .0304

VII. Conclusion.


In the previous six chapters various aspects of a statistical
methodology have been described; now, in closing, an overview seems
appropriate. While never made explicit, the models developed here
parallel those of classical multivariate statistical analysis. Three
similarities are the following:

(1) The multivariate normal distribution is the central object of
study in, say, Anderson (1958). Similarly, the sampling function $p(\cdot)$,
derived in chapter I and parametrized in chapter II, occupies a key
position. Naturally, the consequences of assuming the form of $p(\cdot)$ need
to be critically evaluated, as do the consequences of assuming
multivariate normality. The axiomatic development of chapter I is presented
to elucidate some of these issues.

(2) The parametrization of $p(\cdot)$ in the initial section of
chapter  II  is  rather  analogous  in  form  to  the  multivariate  general  linear 
hypothesis  (Anderson,  chapter  8  [1958])  of  classical  multivariate  analysis. 
Indeed,  the  practical  import  of  both  models  is  to  pose  to  the  consumer  of 
statistical  analyses  a  relatively  simple  problem:  the  specification  of 
relevant  predictors.  A  distinction  between  the  two  is  that  the  Q-sort 
model  presented  here  poses  a  "double"  specification  problem.  Not  only 
must  relevant  predictors  (W)  be  specified,  so  also  must  descriptions  of 
the  items  (Q)  be  provided. 

(3) Toward this end, the factor models of chapter V are presented.
In the instance when both design matrices are being estimated, the result
resembles Hotelling’s canonical correlation analysis (see Anderson,
chapter 12 [1958]). When, however, one of the two matrices is fixed,
the result is more analogous to principal components (Anderson,
chapter 11 [1958]). In practice, both principal components and canonical
correlations are used to aid in reducing data and specifying models;
hopefully, so shall these factor models.

Aside  from  paralleling  classical  multivariate  analysis,  the  present 
work,  in  particular  chapter  I,  contributes  in  a  minor  way  to  the  body  of 
mathematical  models  that  describe  preference  and  selection  behavior.  The 
sampling  function  derived  in  chapter  I  is  sufficiently  similar  to  Luce’s 
model  that  comparisons  are  meaningful  while  sufficiently  different  that 
these  comparisons  are  interesting. 

The sampling function of the Q-sorting model compares to that of
Luce on the following points: (1) Both models express a notion of
"independence of irrelevant alternatives", but (2) only for Luce’s model
is strong stochastic transitivity an immediate consequence. (3) Both
models conceptualize the preference ordering activity as stochastic but
only the Q-sort model is palindrome invariant. (4) Finally, Luce’s
model is a direct consequence of assuming a stability to a changing
"context", i.e. a changing assortment of "irrelevant" alternatives.
The Q-sorting model makes no such assumption. For these reasons, the
Q-sorting model is a useful theoretical "foil" against which Luce’s model
may be better understood. And as with all foils, it would be of
considerably less interest were such contrasts not possible.

To sum up, this work contributes in two ways. First and primarily,
it develops a methodology for analyzing Q-sort data. Second and
secondarily, it adds a new aspect to the theoretical investigation
of preference behavior.